The deliverable is the Jupyter notebook with the required implementations, the corresponding outputs, answers and discussions/descriptions. The notebook is to be submitted as .ipynb and as .html.
This lab is intended to convey knowledge of the following topics:
Familiarize yourself with the basics of Pandas.
Pandas extends NumPy, a Python package that makes working with arrays (especially multi-dimensional ones) easier. Pandas makes working with arrays for managing data (for example, dealing with incomplete data) even simpler, above all through improved indexing.
Series are comparable to one-dimensional arrays with a configurable index.
Elements of a Series can be accessed by label, e.g. serie.loc["Pizza"], or by positional integer index, e.g. serie.iloc[:3]. DataFrames are comparable to two-dimensional arrays with a configurable index.
Columns of a DataFrame can be selected with df["column"] or df.column, rows with df["row":"row"] or df[0:1], but preferably with df.loc["row"] or df.iloc[0]. (Source: https://maucher.pages.mi.hdm-stuttgart.de/python4datascience/PD01Pandas.html)
Familiarize yourself with decision trees, random forests, the single-layer perceptron and the multi-layer perceptron.
(Source: AI lecture handouts)
# %conda install -y psycopg2
# %conda install -y sqlalchemy
# %conda install pandas
# %conda install matplotlib
# %conda install plotly
# %conda install seaborn
# %conda install scikit-learn
# %conda install numpy
import pandas as pd
import numpy as np
import re
import collections
import psycopg2
import sqlalchemy as sa
import json
from matplotlib import pyplot as plt
import plotly.express as px
import plotly.io as pio
import seaborn as sns
pio.renderers.default='notebook'
from sklearn.preprocessing import LabelBinarizer
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
# from sklearn.metrics import plot_confusion_matrix  # deprecated and removed in newer scikit-learn versions; ConfusionMatrixDisplay is used instead
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, median_absolute_error, r2_score
from sklearn.linear_model import SGDRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV
from sklearn import tree
In this first part of the lab, the relevant data is to be read from the .csv file and stored in a PostgreSQL table. The required file Fahrzeuginformationen.csv is located in the current directory.
Load the .csv file into a Pandas DataFrame.
For the created DataFrame, show:
Use the Pandas DataFrame method info() to display the data type of all columns. The type of the column CO2-Emissionen is in fact not a numeric type. Find out why that is, fix the error, and make sure that this column also has a numeric type.
Write the DataFrame adjusted in the previous step into a database table named vehicledata using the Pandas method to_sql().
The extra parameters such as sep, header, etc. could also be omitted.
An alternative to a DataFrame is a Series, but that makes no sense here because it is only one-dimensional; a single column of the DataFrame, for example, would be a Series.
# Load csv
cars_csv = pd.read_csv("Fahrzeuginformationen.csv", sep=",", header=0, index_col = False)
# Convert to dataframe
cars = pd.DataFrame(cars_csv)
The function head() without parameters shows the first 5 rows.
Using display instead of print lets us render the table more nicely.
# Display first 10 rows
display(cars.head(10))
| HST Benennung | HT Benennung | UT Benennung | Karosserie | Neupreis Brutto | Produktgruppe | Kraftstoffart | Schadstoffklasse | CCM | KW | ... | Zuladung | Zulässiges GG | Länge | Breite | Höhe | CO2-Emissionen | Min Energieeffizienzklasse | Antrieb | KSTA Motor | HST-HT Benennung | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Trendline | Bs | 37962 | T5-Klasse Pkw | BS | E6 | 1896 | 112 | ... | 905 | 2967.615635 | 4852 | 1849 | 2019 | 218 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 1 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Comfortline | Bs | 45294 | T5-Klasse Pkw | BS | E6 | 1990 | 110 | ... | 753 | 3061.848723 | 4859 | 1827 | 1938 | 218 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 2 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Generation Six | Bs | 48675 | T5-Klasse Pkw | BS | E6 | 1943 | 110 | ... | 768 | 3018.887414 | 4788 | 1823 | 1990 | 218 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 3 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan 70 Jahre Bulli | Bs | 47201 | T5-Klasse Pkw | BS | E6 | 2013 | 110 | ... | 1007 | 3096.198902 | 4927 | 1952 | 1935 | 210 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 4 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Join | Bs | 49453 | T5-Klasse Pkw | BS | E6 | 1945 | 112 | ... | 972 | 3068.590854 | 4916 | 1872 | 2026 | 210 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 5 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan PanAmericana | Bs | 50795 | T5-Klasse Pkw | BS | E6 | 1938 | 109 | ... | 823 | 3046.890761 | 4886 | 1895 | 1933 | 210 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 6 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Edition | Bs | 51605 | T5-Klasse Pkw | BS | E6 | 1956 | 111 | ... | 724 | 2957.083511 | 4658 | 1946 | 1954 | 210 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 7 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Join lang | Bs | 54560 | T5-Klasse Pkw | BS | E6 | 1946 | 110 | ... | 960 | 3099.520813 | 5162 | 1883 | 2000 | 212 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 8 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Highline | Bs | 57729 | T5-Klasse Pkw | BS | E6 | 1966 | 106 | ... | 707 | 3033.083391 | 4994 | 1871 | 1980 | 218 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
| 9 | Volkswagen | T6 Bus (SG)(05.2015->) | Multivan Business | Bs | 97850 | T5-Klasse Pkw | BS | E6 | 2029 | 106 | ... | 605 | 3006.976797 | 4948 | 1900 | 1931 | 218 | D | FA | STANDARD ->B | Volkswagen-T6 Bus (SG)(05.2015->) |
10 rows × 25 columns
print("Reihen", len(cars.axes[0])) # Alternative: print('total rows: ', len(cars))
print("Spalten", len(cars.axes[1])) # Alternative: print('total columns: ', len(cars.columns))
Reihen 24194
Spalten 25
When searching for NaNs we noticed that there are none; all results, whether with isna or isnull, report no NaNs.
# Number of NaNs per column
display(cars.isna().sum()) # Alternative: cars.isnull().sum()
HST Benennung                 0
HT Benennung                  0
UT Benennung                  0
Karosserie                    0
Neupreis Brutto               0
Produktgruppe                 0
Kraftstoffart                 0
Schadstoffklasse              0
CCM                           0
KW                            0
HST PS                        0
Getriebeart                   0
Getriebe Benennung            0
Anzahl der Türen              0
Leergewicht                   0
Zuladung                      0
Zulässiges GG                 0
Länge                         0
Breite                        0
Höhe                          0
CO2-Emissionen                0
Min Energieeffizienzklasse    0
Antrieb                       0
KSTA Motor                    0
HST-HT Benennung              0
dtype: int64
info() can also be used to solve the previous task, since its output includes the information about rows (entries) and columns.
cars["CO2-Emissionen"] = pd.to_numeric(cars["CO2-Emissionen"]) is the call to convert the column. The problem is that executing it throws the error Unable to parse string "9,2". That means the column contains values we cannot convert automatically, because float values are only recognized when they are written with a '.' as the decimal separator, as in English (9,2 --> 9.2).
In row 20 of the info() output one can see that the data type is object.
We first look at how many values are affected, to check whether there are other comma separators (e.g. thousands separators). Fortunately we find only 6 values, all written in the format x,x. To fix the problem we replace every "," in the CO2-Emissionen column with a "." and convert the column to float64 right away with astype(float).
cars.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24194 entries, 0 to 24193
Data columns (total 25 columns):
 #   Column                      Non-Null Count  Dtype
---  ------                      --------------  -----
 0   HST Benennung               24194 non-null  object
 1   HT Benennung                24194 non-null  object
 2   UT Benennung                24194 non-null  object
 3   Karosserie                  24194 non-null  object
 4   Neupreis Brutto             24194 non-null  int64
 5   Produktgruppe               24194 non-null  object
 6   Kraftstoffart               24194 non-null  object
 7   Schadstoffklasse            24194 non-null  object
 8   CCM                         24194 non-null  int64
 9   KW                          24194 non-null  int64
 10  HST PS                      24194 non-null  int64
 11  Getriebeart                 24194 non-null  object
 12  Getriebe Benennung          24194 non-null  object
 13  Anzahl der Türen            24194 non-null  int64
 14  Leergewicht                 24194 non-null  int64
 15  Zuladung                    24194 non-null  int64
 16  Zulässiges GG               24194 non-null  float64
 17  Länge                       24194 non-null  int64
 18  Breite                      24194 non-null  int64
 19  Höhe                        24194 non-null  int64
 20  CO2-Emissionen              24194 non-null  object
 21  Min Energieeffizienzklasse  24194 non-null  object
 22  Antrieb                     24194 non-null  object
 23  KSTA Motor                  24194 non-null  object
 24  HST-HT Benennung            24194 non-null  object
dtypes: float64(1), int64(10), object(14)
memory usage: 4.6+ MB
# Handle conversion of column CO2 Emissionen
if cars["CO2-Emissionen"].dtype == object:
    # Print all non-numeric values; the issue turns out to be 6 rows with values like 9,2
non_numeric = re.compile(r'[^\d.]+')
display(cars.loc[cars["CO2-Emissionen"].str.contains(non_numeric)]) # https://stackoverflow.com/a/40790077
# Replace all , to convert 9,2 --> 9.2
cars["CO2-Emissionen"] = cars["CO2-Emissionen"].str.replace(',', '.').astype(float)
print("Type of CO2-Emissionen", cars["CO2-Emissionen"].dtype)
| HST Benennung | HT Benennung | UT Benennung | Karosserie | Neupreis Brutto | Produktgruppe | Kraftstoffart | Schadstoffklasse | CCM | KW | ... | Zuladung | Zulässiges GG | Länge | Breite | Höhe | CO2-Emissionen | Min Energieeffizienzklasse | Antrieb | KSTA Motor | HST-HT Benennung | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16886 | Toyota | C-HR (X10) | Hybrid Business Edition | SUV | 32097 | Kompakt-SUV / Geländewagen | 08 | 6T | 1973 | 131 | ... | 449 | 1904.835675 | 4408 | 1808 | 1541 | 9,2 | A+ | FA | HYBRID ->B | Toyota-C-HR (X10) |
| 16888 | Toyota | C-HR (X10) | Hybrid Team D | SUV | 33769 | Kompakt-SUV / Geländewagen | 08 | 6T | 1948 | 134 | ... | 444 | 1912.584563 | 4453 | 1836 | 1558 | 9,2 | A+ | FA | HYBRID ->B | Toyota-C-HR (X10) |
| 16893 | Toyota | C-HR (X10) | Hybrid Lounge | SUV | 37758 | Kompakt-SUV / Geländewagen | 08 | 6T | 1961 | 136 | ... | 467 | 1921.574295 | 4440 | 1775 | 1545 | 9,2 | A+ | FA | HYBRID ->B | Toyota-C-HR (X10) |
| 16895 | Toyota | C-HR (X10) | Hybrid Style Selection | SUV | 37180 | Kompakt-SUV / Geländewagen | 08 | 6T | 1926 | 139 | ... | 449 | 1961.793375 | 4459 | 1817 | 1513 | 9,2 | A+ | FA | HYBRID ->B | Toyota-C-HR (X10) |
| 16897 | Toyota | C-HR (X10) | Hybrid Orange Edition | SUV | 40538 | Kompakt-SUV / Geländewagen | 08 | 6T | 1908 | 128 | ... | 427 | 1974.373493 | 4399 | 1799 | 1544 | 9,2 | A+ | FA | HYBRID ->B | Toyota-C-HR (X10) |
| 18186 | Seat | Ateca (KH7)(03.2016->) | Style | SUV | 25885 | Kompakt-SUV / Geländewagen | BS | E6 | 1396 | 106 | ... | 638 | 1902.668348 | 4254 | 1816 | 1597 | 8,5 | B | FA | STANDARD ->B | Seat-Ateca (KH7)(03.2016->) |
6 rows × 25 columns
Type of CO2-Emissionen float64
We use a config.json, which we have added to .gitignore, so that the password data is not stored in the code.
We also found different ways to build formatted strings, either with %s formatting or with f-strings. Personally we prefer f-strings.
We use the inspector to check whether the table already exists; otherwise the following function would throw a warning.
The suggested method engine.has_table is deprecated.
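For illustration, the two string-formatting variants mentioned above, with placeholder values instead of the real credentials from config.json:
user, password, database = "user", "secret", "datamining"
# %-formatting
connection_str = "postgresql://%s:%s@localhost:5432/%s" % (user, password, database)
# f-string (the variant we prefer)
connection_str = f"postgresql://{user}:{password}@localhost:5432/{database}"
print(connection_str)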
# Get json config
with open('./config.json') as f:
config = json.load(f)
# Connect to database
connection_str =f'postgresql://{config["user"]}:{config["password"]}@localhost:5432/{config["database"]}'
print(f'Connect to database {config["database"]}')
# Create engine
engine = sa.create_engine(connection_str)
print(engine)
# Create table cars
inspector = sa.inspect(engine)
if not inspector.has_table("vehicledata"):
cars.to_sql(name='vehicledata',index=True, index_label='index',con=engine)
else:
print("Table already exists")
Connect to database datamining
Engine(postgresql://joy:***@localhost:5432/datamining)
Use read_sql_query() to implement 3 database queries that are of interest to you. The results of the queries are written into a Pandas DataFrame; display them. While doing this we noticed that communication is extremely important: it took us a very long time to figure out that triple quotes have to be used when working with strings in SQL queries. Had we simply read our mails attentively, we would have saved an enormous amount of time.
Funnily enough, we found another solution that also works but looks far less elegant: pd.read_sql_query("SELECT * FROM cars WHERE \"HST Benennung\" LIKE '%%Volkswagen%%'", engine)
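For comparison, a sketch of the same filter written with a triple-quoted string, so that the double-quoted column identifier no longer needs backslash escaping; we use the table name vehicledata from this notebook, and whether the percent signs still need to be doubled depends on the database driver.
query = """SELECT * FROM vehicledata WHERE "HST Benennung" LIKE '%%Volkswagen%%' """
display(pd.read_sql_query(query, engine))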
With the following query we get a list of manufacturer names and the number of entries per manufacturer, in sorted order. There are 42 manufacturers in total, and as expected most entries come from BMW, Mercedes and Volkswagen.
pd.read_sql_query('SELECT "HST Benennung", COUNT(*) as "Anzahl" FROM vehicledata GROUP BY "HST Benennung" ORDER BY "Anzahl" DESC', engine)
| HST Benennung | Anzahl | |
|---|---|---|
| 0 | BMW | 4030 |
| 1 | Mercedes-Benz | 3055 |
| 2 | Volkswagen | 2185 |
| 3 | Opel | 1701 |
| 4 | Audi | 1362 |
| 5 | Ford | 1337 |
| 6 | Skoda | 1247 |
| 7 | Citroen | 846 |
| 8 | Peugeot | 769 |
| 9 | Volvo | 761 |
| 10 | Hyundai | 645 |
| 11 | Renault | 625 |
| 12 | Seat | 618 |
| 13 | Jaguar | 573 |
| 14 | Land Rover | 551 |
| 15 | Toyota | 516 |
| 16 | Fiat | 465 |
| 17 | Kia | 445 |
| 18 | Mazda | 303 |
| 19 | Nissan | 278 |
| 20 | Dacia | 211 |
| 21 | Honda | 176 |
| 22 | Ssangyong | 171 |
| 23 | Mitsubishi | 169 |
| 24 | MINI | 160 |
| 25 | Lexus | 159 |
| 26 | Porsche | 145 |
| 27 | DS | 124 |
| 28 | Alfa Romeo | 113 |
| 29 | Subaru | 95 |
| 30 | Jeep | 76 |
| 31 | Smart | 62 |
| 32 | Infiniti | 54 |
| 33 | Lada | 52 |
| 34 | Alpina | 44 |
| 35 | Suzuki | 24 |
| 36 | Abarth | 20 |
| 37 | Chevrolet | 14 |
| 38 | Cadillac | 7 |
| 39 | Cupra | 2 |
| 40 | Corvette | 2 |
| 41 | Borgward | 2 |
With the following query we look at the number of emission classes (Schadstoffklassen) per fuel type. Interestingly, although there are several different emission classes, not all of them are represented for every fuel type.
pd.read_sql_query("""SELECT "Kraftstoffart", "Schadstoffklasse", COUNT("Schadstoffklasse") AS "Häufigkeit" FROM vehicledata GROUP BY "Kraftstoffart","Schadstoffklasse" ORDER BY "Kraftstoffart","Häufigkeit" DESC """, engine)
| Kraftstoffart | Schadstoffklasse | Häufigkeit | |
|---|---|---|---|
| 0 | 06 | E6 | 65 |
| 1 | 06 | 6T | 44 |
| 2 | 07 | E6 | 52 |
| 3 | 07 | 6T | 33 |
| 4 | 08 | 6T | 414 |
| 5 | 08 | E6 | 160 |
| 6 | 08 | 6D | 101 |
| 7 | 08 | E5 | 3 |
| 8 | 09 | E6 | 8 |
| 9 | 10 | 6T | 63 |
| 10 | 10 | E6 | 11 |
| 11 | 10 | 6D | 5 |
| 12 | 25 | 6T | 12 |
| 13 | 25 | E6 | 8 |
| 14 | 26 | 6D | 3 |
| 15 | 26 | 6T | 1 |
| 16 | B1 | 6T | 2 |
| 17 | BN | 6T | 1 |
| 18 | BS | E6 | 5247 |
| 19 | BS | 6T | 5086 |
| 20 | BS | 6D | 742 |
| 21 | BS | E5 | 16 |
| 22 | D | 6T | 5551 |
| 23 | D | E6 | 5455 |
| 24 | D | 6D | 749 |
| 25 | D | E5 | 10 |
| 26 | E | E6 | 42 |
| 27 | E | 6T | 41 |
| 28 | S | 6T | 1 |
| 29 | SP | E6 | 130 |
| 30 | SP | 6T | 130 |
| 31 | SP | 6D | 8 |
With the following query we look at the highest CO2 emission per manufacturer. We had actually expected to see Volkswagen relatively far up here because of the diesel scandal. Apparently the effect was not as large as we expected, or the data set is not up to date.
Chevrolet and Jeep, however, were near the top as expected.
pd.read_sql_query('SELECT "HST Benennung",MAX("CO2-Emissionen") as "Maximum_CO2" FROM vehicledata GROUP BY "HST Benennung" ORDER BY "Maximum_CO2" DESC', engine)
| HST Benennung | Maximum_CO2 | |
|---|---|---|
| 0 | Mercedes-Benz | 397.0 |
| 1 | Chevrolet | 386.0 |
| 2 | Jeep | 338.0 |
| 3 | Ford | 320.0 |
| 4 | Nissan | 319.0 |
| 5 | Porsche | 317.0 |
| 6 | Audi | 309.0 |
| 7 | Cadillac | 298.0 |
| 8 | Land Rover | 298.0 |
| 9 | BMW | 296.0 |
| 10 | Infiniti | 293.0 |
| 11 | Corvette | 291.0 |
| 12 | Jaguar | 272.0 |
| 13 | Lexus | 263.0 |
| 14 | Subaru | 259.0 |
| 15 | Alpina | 254.0 |
| 16 | Mitsubishi | 246.0 |
| 17 | Fiat | 244.0 |
| 18 | Hyundai | 244.0 |
| 19 | Kia | 244.0 |
| 20 | Borgward | 233.0 |
| 21 | Alfa Romeo | 227.0 |
| 22 | Lada | 226.0 |
| 23 | Mazda | 219.0 |
| 24 | Volkswagen | 219.0 |
| 25 | Ssangyong | 217.0 |
| 26 | Toyota | 204.0 |
| 27 | Opel | 203.0 |
| 28 | Renault | 200.0 |
| 29 | Volvo | 187.0 |
| 30 | Citroen | 186.0 |
| 31 | Peugeot | 181.0 |
| 32 | Honda | 178.0 |
| 33 | Seat | 173.0 |
| 34 | MINI | 169.0 |
| 35 | Dacia | 169.0 |
| 36 | Skoda | 168.0 |
| 37 | Cupra | 168.0 |
| 38 | Abarth | 155.0 |
| 39 | Suzuki | 143.0 |
| 40 | DS | 138.0 |
| 41 | Smart | 123.0 |
Use describe() to display all descriptive statistics. Create numeric_features, containing only the column names of the numeric columns, and likewise categoric_features. For HST_Benennung, Neupreis Brutto, CO2-Emissionen and Produktgruppe, show the distribution of the values in a bar plot or histogram.
At first we were confused by nunique and thought it referred to not unique; it actually means number of unique in the sense of a count. Without it, one would have to wrap the whole thing in len(...).
We noticed that we did not need a for loop at all, since cars.nunique() of course works just as well, as shown after the output below.
# Distinct count per column
for column in cars:
print(f"{column}: ", cars[column].nunique())
HST Benennung:  42
HT Benennung:  617
UT Benennung:  3782
Karosserie:  22
Neupreis Brutto:  19739
Produktgruppe:  28
Kraftstoffart:  14
Schadstoffklasse:  4
CCM:  2197
KW:  428
HST PS:  560
Getriebeart:  2
Getriebe Benennung:  103
Anzahl der Türen:  4
Leergewicht:  1833
Zuladung:  1208
Zulässiges GG:  23919
Länge:  2342
Breite:  651
Höhe:  1203
CO2-Emissionen:  279
Min Energieeffizienzklasse:  8
Antrieb:  3
KSTA Motor:  6
HST-HT Benennung:  617
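As mentioned above, the same overview can be produced without the explicit loop:
# Number of distinct values per column, without the loop
display(cars.nunique())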
Here one sees 8 values for each column, which are ideal for a box plot.
The count value is, as in a previous task, always the total number of entries, since we could not find any NaNs.
It was also surprising how long cars are; we had never really noticed, since we do not perceive lengths the way we perceive heights.
Based on the maximum values for weight, length and payload, we assume that at least small trucks or pickups must be present in the data set.
The minimum value of 29 kilograms for the empty weight, however, looks quite suspect; see the sketch after the table below for how such rows could be inspected.
# Using describe()
cars.describe().round()
| Neupreis Brutto | CCM | KW | HST PS | Anzahl der Türen | Leergewicht | Zuladung | Zulässiges GG | Länge | Breite | Höhe | CO2-Emissionen | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 | 24194.0 |
| mean | 40301.0 | 1851.0 | 127.0 | 172.0 | 4.0 | 1589.0 | 579.0 | 2135.0 | 4561.0 | 1835.0 | 1579.0 | 136.0 |
| std | 22038.0 | 616.0 | 61.0 | 83.0 | 1.0 | 367.0 | 191.0 | 524.0 | 413.0 | 93.0 | 220.0 | 33.0 |
| min | 6835.0 | 831.0 | 42.0 | 51.0 | 2.0 | 29.0 | 58.0 | 0.0 | 2571.0 | 1394.0 | 1180.0 | 8.0 |
| 25% | 26910.0 | 1482.0 | 88.0 | 120.0 | 4.0 | 1381.0 | 483.0 | 1873.0 | 4339.0 | 1782.0 | 1446.0 | 115.0 |
| 50% | 36720.0 | 1941.0 | 111.0 | 151.0 | 5.0 | 1555.0 | 547.0 | 2075.0 | 4575.0 | 1829.0 | 1502.0 | 130.0 |
| 75% | 47632.0 | 2011.0 | 141.0 | 192.0 | 5.0 | 1745.0 | 615.0 | 2320.0 | 4801.0 | 1883.0 | 1647.0 | 153.0 |
| max | 287347.0 | 6706.0 | 511.0 | 705.0 | 5.0 | 10365.0 | 7726.0 | 15296.0 | 6958.0 | 2293.0 | 3023.0 | 397.0 |
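As announced above, a small sketch to follow up on the suspiciously low minimum empty weight; the 500 kg threshold is an arbitrary choice of ours.
# Show vehicles with an implausibly low empty weight (threshold chosen arbitrarily)
suspect = cars[cars["Leergewicht"] < 500]
display(suspect[["HST Benennung", "UT Benennung", "Leergewicht", "Zuladung"]])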
There are several ways to reach the goal. Our first attempt, taken from Stack Overflow, gave us an ugly solution; after searching more carefully we found the include option of select_dtypes.
There are 12 numeric columns in total, 10 integers and 2 floats, including the CO2-Emissionen column we adjusted in a previous task.
# numeric_features
# 10x int64
# 2x float64
numeric_features = cars.select_dtypes(include='number').columns # Alternative: numeric_features = cars._get_numeric_data().columns.values.tolist()
list(numeric_features)
['Neupreis Brutto', 'CCM', 'KW', 'HST PS', 'Anzahl der Türen', 'Leergewicht', 'Zuladung', 'Zulässiges GG', 'Länge', 'Breite', 'Höhe', 'CO2-Emissionen']
Having found include, we simply tried exclude and were delighted that it worked just as intuitively. As with the use of describe().round() further above, we have to note once again that pandas has an enormous number of helper functions that are incredibly intuitive and practical.
# categoric_features 13 objects
categoric_features = cars.select_dtypes(exclude='number').columns
list(categoric_features)
['HST Benennung', 'HT Benennung', 'UT Benennung', 'Karosserie', 'Produktgruppe', 'Kraftstoffart', 'Schadstoffklasse', 'Getriebeart', 'Getriebe Benennung', 'Min Energieeffizienzklasse', 'Antrieb', 'KSTA Motor', 'HST-HT Benennung']
Our first approach to the task was to pick a graph library and simply plot the data. The first difficulty was finding a suitable library at all, because there are quite a lot of them on the market. Initially we chose matplotlib, because we had found it in a tutorial. Later we mostly used plotly.express, because in our opinion this library offers more possibilities (e.g. more chart types, hovers and interactivity with the graph in general) and its default configuration is visually much more appealing.
With our first, naively plotted graph we noticed that the distribution does not match the results of the previous SQL query: for example, BMW should have the most entries with about 4000, but the graph shows a maximum of 6000, and Volkswagen, which should actually have only about 2000 entries, shows 6000 entries.
# This solution is not working because data wasn't prepared
%matplotlib inline
plt.hist(cars['HST Benennung'])
plt.title('HST Benennung')
plt.xticks(rotation='vertical', fontsize = 'x-small') # prevent overlapping text
plt.show()
We realized relatively quickly that this must be due to the categorical values. For numeric values it seems to work: the number of elements and the x-axis scale are detected correctly. For categorical values one has to determine the count of each element differently and use those counts as the data for the plot. To obtain them we used collections.
In this result one can see that the data is correct, just as our SQL query showed. The largest bars, BMW and Mercedes-Benz, stand out a little, whereas the manufacturers Borgward, Cupra and Corvette have so few entries that their bars are barely visible.
# HST Benennung
# Get count of elements with 'collections'
hst_counts = collections.Counter(cars['HST Benennung'])
# Convert to dataframe
hst_dataframe = pd.DataFrame.from_dict(hst_counts, orient='index')
# Graph 1: with pandas dataframe plot
hst_dataframe.plot(kind='bar', title='HST Benennung', figsize=(10,5), legend=[])
plt.show()
# Graph 2: with plotly express
fig = px.histogram(cars, x="HST Benennung",title="HST Benennung")
fig.show("notebook")
Here we did not have the problem with the categorical data, since the price is a numeric value and can therefore simply be placed on a scale.
The curve first rises steeply and then drops steeply again relatively quickly. From this we can tell that our data set contains many vehicles with a gross list price (Neupreis Brutto) between 20k and 40k. From 90k onward the curve approaches 0, which means we have fewer expensive vehicles.
We had, however, expected a curve very much like this.
# Neupreis Brutto
fig = px.histogram(cars, x="Neupreis Brutto",title="Neupreis Brutto")
fig.show("notebook")
In this graph one can see that untere Mittelklasse / Kompaktklasse and Mittelklasse occur most frequently, and clearly more frequently than all other values. This shows a connection with the Neupreis Brutto graph, whose most frequent values lie around mid-class car prices.
Apart from that, the remaining values either lie close together between 1000 and 2000 occurrences or are in some cases very low; the latter are mostly the more luxurious product groups, which are likewise only weakly represented in the more expensive ranges of the Neupreis Brutto graph.
# Produktgruppe
pgr_counts = collections.Counter(cars['Produktgruppe'])
pgr_dataframe = pd.DataFrame.from_dict(pgr_counts, orient='index')
pgr_dataframe.plot(kind='bar', title='Produktgruppe', figsize=(9,4))
plt.show()
The graph is fairly normally distributed around a mean of about 120. Most values lie between 100 and 200, while there are hardly any values below 100 or above 200. Around 50 and slightly below one can see another very small bump, which possibly belongs to the electric cars.
# CO2-Emissionen
fig = px.histogram(cars, x="CO2-Emissionen",title="CO2-Emissionen")
fig.show("notebook")
In this section a classifier is to be trained which predicts the corresponding vehicle segment (Produktgruppe) from input features such as width, height, weight, and so on.
In this part of the lab, the columns previously defined in numeric_features and the non-numeric columns Antrieb, Kraftstoffart and KSTA Motor are to be used as input features. The target variable (output) is the column Produktgruppe.

A general observation up front: due to the distribution of the data, the information for some product groups is in places not meaningful enough, e.g. Luxusklasse Cabrio, where the fuel type may not allow any conclusion about the class.
The graph shows the number of each drive type (Antrieb) per vehicle class. What stands out is the differing amount of data: we have the most information about the Kompaktklasse, for example, but hardly any entries for the luxury classes and pickups. Some product classes, such as Kleinwagen, have only one type of drive represented in our data set.
The less diversity there is within a class, the better one can infer the class from exactly that drive type, or rule a class out. If a vehicle has an HA drive, for example, we can assume, based on our data set and the amount of data, that it is not a Kleinwagen, since we have no entry of a Kleinwagen with HA drive. The same holds for pickups. It is questionable, however, whether the small amount of data really allows such a conclusion to be drawn reliably.
# Antrieb Graph
# Prepare data with group by Produktgruppe and Antrieb
antrieb_group = cars.groupby(["Produktgruppe", "Antrieb"]).size().unstack(fill_value=0)
# Plot
antrieb_group.plot.barh(stacked=True, figsize=(10,10), title="Antrieb in Produktgruppe")
plt.show()
On the code: after 10 colors the color cycle repeats, so we have to define additional colors ourselves.
The amount of information per class in this graph is obviously the same as in the other graph.
What is striking, however, is that despite the large number of different fuel types, the data is mainly represented by D, BS and 08.
# Kraftstoffart Graph
# Prepare data with group by Produktgruppe and Kraftstoffart
kraftstoffart_group = cars.groupby(["Produktgruppe", "Kraftstoffart"]).size().unstack(fill_value=0)
# Plot
kraftstoffart_group.plot.barh(stacked=True, figsize=(10,10), title="Kraftstoffart in Produktgruppe",color=['#ff11ff', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#07becf', '#100ecf','#05be0f','#87feff','#fcbfcf'])
plt.show()
Here standard petrol and standard diesel are by far the most common. Because of the correlation between this and the previous graph, both also look distributed in roughly the same way.
# KSTA Motor Graph
# Prepare data with group by Produktgruppe and KSTA Motor
KSTA_group = cars.groupby(["Produktgruppe", "KSTA Motor"]).size().unstack(fill_value=0)
# Plot
KSTA_group.plot.barh(stacked=True, figsize=(10,10), title="KSTA Motor in Produktgruppe")
plt.show()

The plots show the distribution of the numeric values. One can see the median, the upper and lower quartile, and above all the outliers. Especially for KW, HST PS and CCM there are very many outliers. Some outliers suggest possible errors in the data, e.g. the outlier in Zuladung for Kleintransporter / Pkw.
Leergewicht, Zuladung and Zulässiges GG, on the other hand, show only little spread. The plot for the number of doors is hard for us to interpret, since this value has little variance or distribution to begin with.
# Plot all graphs for numeric features
for column in numeric_features:
sns.boxplot(x=column, y="Produktgruppe", data=cars)
sns.set(rc={'figure.figsize':(10, 12)})
plt.tight_layout()
plt.show()
The scatter plot should show Länge, Höhe, Produktgruppe and Leergewicht, with Neupreis Brutto and HST-HT Benennung displayed on hover. What is striking here is how simple the visualization is.
In this graph we again have the problem of repeating colors, but since one can hover over the points and the groups are locally separated, we did not define our own colors.
Since the minicars usually have only 2 doors and are really small, they form a small island in the graph, because the step from 2 doors to 4 doors requires a larger body.
The empty weight correlates with width and length; accordingly, the lightest vehicles (small points) are at the bottom left and the heaviest (large points) at the top right.
Because the colors form clusters, one can draw clear boundaries around them and thus make very good predictions of the vehicle class for a given length and width. Especially the extremes such as Sprinter, minicars and SUVs can be predicted well.
cars_scatter = px.scatter(cars, x="Länge", y="Breite", color="Produktgruppe", size="Leergewicht", hover_data=["Neupreis Brutto", "HST-HT Benennung"])
cars_scatter.show()
Assign the input data to the 2-dimensional numpy array X. For Produktgruppe, perform a label encoding with the scikit-learn LabelEncoder and assign this data to the 1-dimensional numpy array y.
categoric_values = ["Antrieb", "Kraftstoffart", "KSTA Motor"]
feature_names = []
# Initialize
lb = LabelBinarizer()
le = LabelEncoder()
For the one-hot encoding of the categorical values we used a LabelBinarizer. With it we one-hot encoded the data using fit_transform(), a helper that combines fit() and transform() in one call. In between we also kept track of the classes and stored them in an array in the correct order, so that the labels remain available later.
Problem
Since in the following task we are supposed to combine the results with the numeric values, we wanted to combine the categorical results as well. One difficulty there was determining the scheme by which the arrays should be assembled. We were able to figure this out by asking and created a small example scheme:
Each row corresponds to one vehicle, so we have 24194 rows. For the columns, each numeric value simply gets one column in which the value is entered. For the categorical values there are as many columns as the categorical feature has distinct values. For example, the feature Antrieb has the values FA, HA, A, so it expands to 3 columns, which contain a 1 in the column matching the vehicle's value and 0 everywhere else.
With this approach the order is particularly important.
Scheme
nf = numeric feature, cf = categoric feature
nf1 | nf2 | cf3.1 cf3.2 cf3.3 | cf 4.1 cf 4.2
Problem
Another problem was that when assembling the arrays with np.concatenate we got the error all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 1.
This could be solved with the axis=1 parameter, as the toy example below illustrates.
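The effect of the axis parameter can be seen in a toy example (the numbers are made up): without axis=1 NumPy tries to stack the arrays on top of each other, which fails because the column counts differ; with axis=1 the rows are joined side by side.
import numpy as np
one_hot = np.array([[1, 0, 0],
                    [0, 1, 0]])   # e.g. a one-hot encoded feature with 3 classes
numeric = np.array([[37962],
                    [45294]])     # e.g. a single numeric column
# np.concatenate([one_hot, numeric]) would raise the dimension error quoted above
combined = np.concatenate([one_hot, numeric], axis=1)
print(combined.shape)  # (2, 4): one row per vehicle, columns appended side by side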
# One hot encode all categoric values
antrieb = lb.fit_transform(cars[categoric_values[0]])
classes_antrieb = lb.classes_
kraftstoffart = lb.fit_transform(cars[categoric_values[1]])
classes_kraftstoffart = lb.classes_
ksta = lb.fit_transform(cars[categoric_values[2]])
classes_ksta = lb.classes_
# Get all feature names in correct order
feature_names = np.concatenate([classes_antrieb, classes_kraftstoffart, classes_ksta])
print(feature_names)
# Concatenate all categorical features into one array with 23 columns (one row per vehicle), following the scheme above
categoric_features_one_hot = np.concatenate([antrieb, kraftstoffart, ksta], axis=1)
['A ' 'FA ' 'HA ' '06' '07' '08' '09' '10' '25' '26' 'B1' 'BN' 'BS' 'D ' 'E ' 'S ' 'SP' 'BI_CNG ->B' 'BI_LPG ->B' 'HYBRID ->B' 'HYBRID ->D' 'STANDARD ->B' 'STANDARD ->D']
Here we first collected all numeric features in one array and swapped the axes with transpose() so that the shapes fit together again. Without this, errors similar to the np.concatenate() one above would occur.
Finally we join the numeric features with the categorical features.
# Fill all numeric values in an array and concat them with categoric features
numeric_values = []
for column in numeric_features:
numeric_values.append(cars[column])
feature_names = np.append(feature_names, column)
# Change axis
transposed_numeric_values = np.array(numeric_values).transpose()
# Concatenate categoric features with numeric features
X = np.concatenate([categoric_features_one_hot, transposed_numeric_values], axis=1)
# Label encode target variable
y = le.fit_transform(cars["Produktgruppe"])
print(y)
[21 21 21 ... 22 22 22]
Use the scikit-learn method train_test_split() to split X and y into a training and a test partition. 30% of the data should be used for testing, 70% for training.
For generating the training and test data there is the helper method train_test_split(), with which the data can easily be split. Both X and y can be passed in directly and split according to the given percentages. The random state is set to 0 so that the shuffling before the split is reproducible: the data is still shuffled, but always in the same way. We chose this so that we get the same test and training data every time and therefore do not get overly different results from the decision trees afterwards, which makes the description easier.
# Split data into 70% training and 30% test data; random_state=0 makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, random_state=0)
print("Anzahl X Trainingsdaten", len(X_train))
print("Anzahl X Testdaten", len(X_test))
print("Anzahl y Trainingsdaten", len(y_train))
print("Anzahl y Testdaten", len(y_test))
Anzahl X Trainingsdaten 16935
Anzahl X Testdaten 7259
Anzahl y Trainingsdaten 16935
Anzahl y Testdaten 7259
Evaluate the quality of the decision tree by
Interpret the result.
Perform a 10-fold cross-validation of the decision tree with the data X and y. Interpret the result.
Query feature_importances_ and display the values in a bar plot.
# Create decision tree classifier object
decision_tree = DecisionTreeClassifier()
random_forest = RandomForestClassifier()
# Train Decision Tree
decision_tree = decision_tree.fit(X_train, y_train)
# Predict the response for test data
decision_tree_y_pred = decision_tree.predict(X_test)
Now we print the report. Using the target_names we can see how accurately the individual classes can be predicted. The support indicates how many test samples were available. Surprisingly, despite the small number of test cases, the prediction for Mittelklasse is very accurate, whereas the Luxusklasse Coupé class was predicted rather poorly. On average, depending on the support, the F1 score and the accuracy are around 83%. Recall describes how many of, say, the T5-Klasse Pkw vehicles were also classified as such; precision describes how many of the vehicles classified as T5-Klasse Pkw actually belonged to that class. For Kleintransporter / Pkw, only about half of the vehicles classified as such actually were Kleintransporter. Each of us got slightly different values; often there was a deviation of 1% to 3%.
Source: https://vidyasheela.com/what-is-f1-score-and-what-is-its-importance-in-machine-learning/
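To make the metrics tangible, a small sketch of how precision, recall and F1 relate for a single class; the counts below are invented purely for illustration.
# Hypothetical counts for one class
tp = 63   # vehicles correctly predicted as this class
fp = 15   # predicted as this class, but actually another one
fn = 19   # actually this class, but predicted as another one
precision = tp / (tp + fp)                          # how many predicted positives were correct
recall = tp / (tp + fn)                             # how many actual positives were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of both
print(round(precision, 2), round(recall, 2), round(f1, 2))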
# Create classification report
report = classification_report(y_test, decision_tree_y_pred, target_names = cars["Produktgruppe"].unique())
print(report)
precision recall f1-score support
T5-Klasse Pkw 0.81 0.77 0.79 82
Cabrio ab Mittelklasse 0.93 0.91 0.92 106
Cabrio bis Kompaktklasse 0.84 0.86 0.85 138
Minicars 0.79 0.81 0.80 32
Kompakt-SUV / Geländewagen 0.95 0.89 0.92 66
Sportcabrio / Targa 0.53 0.54 0.54 48
Luxusklasse Cabrio 0.91 0.90 0.90 433
Coupé ab Mittelklasse 0.69 0.68 0.69 587
Sportcoupé 0.67 0.70 0.68 505
Luxusklasse Coupé 0.57 0.89 0.70 9
Coupé bis Kompaktklasse 0.75 0.48 0.59 25
untere Mittelklasse / Kompaktklasse 0.84 0.84 0.84 105
Kleinwagen 0.50 0.49 0.49 37
Sprinter-Klasse 0.99 0.97 0.98 302
T5-Klasse Lkw 0.89 0.88 0.89 1207
Kleintransporter / Lkw 0.80 0.90 0.85 103
Mittelklasse 0.82 0.88 0.85 16
obere Mittelklasse 0.71 0.68 0.70 76
Kleintransporter / Pkw 0.53 0.58 0.55 53
Luxusklasse Limousine / Kombi 0.98 1.00 0.99 163
Motorcaravan 0.87 0.88 0.87 162
Pickup 0.86 0.82 0.84 233
große SUV / Geländewagen 0.84 0.74 0.79 90
kleine SUV / Geländewagen 0.90 0.89 0.89 261
mittlere SUV / Geländewagen 0.72 0.70 0.71 176
Kompaktvan 0.75 0.73 0.74 590
Microvan / Minivan 0.81 0.85 0.83 332
Van 0.87 0.88 0.88 1322
accuracy 0.83 7259
macro avg 0.79 0.79 0.79 7259
weighted avg 0.83 0.83 0.83 7259
The function plot_confusion_matrix() is deprecated and will be removed in a later version. We therefore opted for the alternative confusion_matrix() together with ConfusionMatrixDisplay().
The confusion matrix shows how the classes were assigned versus what they really were. From the heatmap one can see that Van had the most correctly classified samples, mainly because it also had the most available data. Furthermore, the diagonal line that we want is clearly visible, which means that many classes were classified correctly. One can also see the outliers and their "mirror image" across the diagonal: similar classes are misclassified as each other similarly often.
# Create confusion matrix
# plot_confusion_matrix(decision_tree, X_test, y_test) #deprecated
# plt.show()
decision_tree_cm = confusion_matrix(y_true=y_test, y_pred=decision_tree_y_pred)
decision_tree_cmd = ConfusionMatrixDisplay(decision_tree_cm)
figure, axes = plt.subplots(figsize=(19, 14))
axes.grid(False)
decision_tree_cmd.plot(ax=axes)
<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x18582686830>
The cross_val_score function partitions the data into different test partitions each time and then checks the accuracy. The results are the accuracy per partition. That means our tree is differently accurate with different test data; its accuracy ranges roughly from 69% to 78%.
# 10x cross validation
decision_tree_score = cross_val_score(decision_tree, X, y, cv=10)
print(decision_tree_score)
[0.68553719 0.69090909 0.73099174 0.69710744 0.7399752 0.68747416 0.7465895 0.76849938 0.77635387 0.73749483]
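The spread of the ten scores can be summarized in one line each; a short follow-up sketch using decision_tree_score from above:
# Summarize the 10 cross-validation accuracies
print(f"Mean accuracy: {decision_tree_score.mean():.3f}")
print(f"Standard deviation: {decision_tree_score.std():.3f}")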
From the feature importances one can see that above all 3 features are decisive for the classification: Länge, Höhe and Zulässiges GG.
# Feature importances
print(decision_tree.feature_importances_)
plt.bar(np.arange(len(decision_tree.feature_importances_)), decision_tree.feature_importances_)
plt.xticks(np.arange(len(decision_tree.feature_importances_)), feature_names, rotation='vertical', fontsize = 'x-small')
plt.show()
[1.14405815e-02 3.69375471e-03 9.96229299e-03 0.00000000e+00 0.00000000e+00 2.27601879e-04 0.00000000e+00 2.15068813e-04 1.21280154e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.27558383e-03 1.37889287e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.37764422e-03 5.05091106e-04 1.10967606e-03 9.28716305e-04 8.11251508e-02 2.82621021e-02 1.66702523e-02 3.51830620e-02 7.72736530e-02 5.25824534e-02 6.59646920e-02 1.75490533e-01 2.08972164e-01 3.64173649e-02 1.57306787e-01 3.25156011e-02]
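Since the raw array is hard to read, a small sketch that pairs each importance with its feature name and sorts them, using feature_names and decision_tree from above:
# Pair feature names with their importances and show the largest ones
importances = pd.Series(decision_tree.feature_importances_, index=feature_names)
display(importances.sort_values(ascending=False).head(10))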
In addition to the requested plots, we also had the decision tree itself output. One immediately sees that the structure is extremely complex because of the many features. The first output shows the tree in text form, the second renders it graphically.
text_representation = tree.export_text(decision_tree)
print(text_representation)
|--- feature_33 <= 1537.50
|   |--- feature_30 <= 2027.89
|   |   |--- feature_31 <= 4164.50
|   |   |   |--- feature_31 <= 3778.00
|   |   |   |   |--- feature_27 <= 2.50
|   |   |   |   |   |--- feature_29 <= 266.50
|   |   |   |   |   |   |--- feature_23 <= 28710.00
|   |   |   |   |   |   |   |--- class: 13
[... text representation of the full tree continues for many hundreds of lines; output truncated ...]
|--- feature_29 > 411.50 | | | | | | |--- feature_31 <= 4504.00 | | | | | | | |--- feature_23 <= 54390.50 | | | | | | | | |--- feature_26 <= 134.50 | | | | | | | | | |--- class: 27 | | | | | | | | |--- feature_26 > 134.50 | | | | | | | | | |--- class: 1 | | | | | | | |--- feature_23 > 54390.50 | | | | | | | | |--- feature_28 <= 1690.50 | | | | | | | | | |--- class: 17 | | | | | | | | |--- feature_28 > 1690.50 | | | | | | | | | |--- feature_33 <= 1400.00 | | | | | | | | | | |--- class: 2 | | | | | | | | | |--- feature_33 > 1400.00 | | | | | | | | | | |--- feature_26 <= 291.00 | | | | | | | | | | | |--- class: 18 | | | | | | | | | | |--- feature_26 > 291.00 | | | | | | | | | | | |--- class: 1 | | | | | | |--- feature_31 > 4504.00 | | | | | | | |--- feature_30 <= 2121.65 | | | | | | | | |--- feature_26 <= 189.50 | | | | | | | | | |--- feature_21 <= 0.50 | | | | | | | | | | |--- feature_23 <= 48199.00 | | | | | | | | | | | |--- class: 2 | | | | | | | | | | |--- feature_23 > 48199.00 | | | | | | | | | | | |--- class: 0 | | | | | | | | | |--- feature_21 > 0.50 | | | | | | | | | | |--- feature_32 <= 1767.00 | | | | | | | | | | | |--- class: 1 | | | | | | | | | | |--- feature_32 > 1767.00 | | | | | | | | | | | |--- class: 0 | | | | | | | | |--- feature_26 > 189.50 | | | | | | | | | |--- feature_24 <= 3044.50 | | | | | | | | | | |--- feature_33 <= 1435.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_33 > 1435.00 | | | | | | | | | | | |--- class: 1 | | | | | | | | | |--- feature_24 > 3044.50 | | | | | | | | | | |--- feature_33 <= 1360.00 | | | | | | | | | | | |--- class: 17 | | | | | | | | | | |--- feature_33 > 1360.00 | | | | | | | | | | | |--- class: 1 | | | | | | | |--- feature_30 > 2121.65 | | | | | | | | |--- feature_28 <= 1767.00 | | | | | | | | | |--- feature_24 <= 2158.00 | | | | | | | | | | |--- feature_34 <= 165.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_34 > 165.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_24 > 2158.00 | | | | | | | | | | |--- feature_29 <= 474.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_29 > 474.00 | | | | | | | | | | | |--- class: 0 | | | | | | | | |--- feature_28 > 1767.00 | | | | | | | | | |--- feature_25 <= 335.00 | | | | | | | | | | |--- feature_23 <= 45929.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_23 > 45929.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- feature_25 > 335.00 | | | | | | | | | | |--- feature_0 <= 0.50 | | | | | | | | | | | |--- class: 18 | | | | | | | | | | |--- feature_0 > 0.50 | | | | | | | | | | | |--- class: 2 | | | | |--- feature_29 > 479.50 | | | | | |--- feature_27 <= 2.50 | | | | | | |--- feature_29 <= 567.00 | | | | | | | |--- feature_31 <= 4479.50 | | | | | | | | |--- feature_33 <= 1399.50 | | | | | | | | | |--- class: 2 | | | | | | | | |--- feature_33 > 1399.50 | | | | | | | | | |--- class: 3 | | | | | | | |--- feature_31 > 4479.50 | | | | | | | | |--- feature_28 <= 1803.00 | | | | | | | | | |--- feature_32 <= 1894.00 | | | | | | | | | | |--- feature_33 <= 1399.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_33 > 1399.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- feature_32 > 1894.00 | | | | | | | | | | |--- feature_23 <= 62677.00 | | | | | | | | | | | |--- class: 0 | | | | | | | | | | |--- feature_23 > 
62677.00 | | | | | | | | | | | |--- class: 2 | | | | | | | | |--- feature_28 > 1803.00 | | | | | | | | | |--- feature_32 <= 1809.00 | | | | | | | | | | |--- class: 10 | | | | | | | | | |--- feature_32 > 1809.00 | | | | | | | | | | |--- class: 0 | | | | | | |--- feature_29 > 567.00 | | | | | | | |--- feature_28 <= 1714.50 | | | | | | | | |--- feature_33 <= 1317.00 | | | | | | | | | |--- class: 17 | | | | | | | | |--- feature_33 > 1317.00 | | | | | | | | | |--- class: 18 | | | | | | | |--- feature_28 > 1714.50 | | | | | | | | |--- class: 17 | | | | | |--- feature_27 > 2.50 | | | | | | |--- class: 27 | | | |--- feature_27 > 3.50 | | | | |--- feature_31 <= 4481.50 | | | | | |--- feature_33 <= 1487.50 | | | | | | |--- feature_30 <= 2139.78 | | | | | | | |--- feature_2 <= 0.50 | | | | | | | | |--- feature_33 <= 1467.50 | | | | | | | | | |--- feature_23 <= 45467.50 | | | | | | | | | | |--- feature_34 <= 161.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_34 > 161.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_23 > 45467.50 | | | | | | | | | | |--- feature_26 <= 236.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_26 > 236.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- feature_33 > 1467.50 | | | | | | | | | |--- feature_23 <= 36152.00 | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_23 > 36152.00 | | | | | | | | | | |--- feature_26 <= 182.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_26 > 182.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- feature_2 > 0.50 | | | | | | | | |--- feature_31 <= 4390.00 | | | | | | | | | |--- class: 27 | | | | | | | | |--- feature_31 > 4390.00 | | | | | | | | | |--- feature_23 <= 38393.50 | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_23 > 38393.50 | | | | | | | | | | |--- class: 14 | | | | | | |--- feature_30 > 2139.78 | | | | | | | |--- feature_33 <= 1454.50 | | | | | | | | |--- feature_34 <= 38.00 | | | | | | | | | |--- class: 27 | | | | | | | | |--- feature_34 > 38.00 | | | | | | | | | |--- class: 14 | | | | | | | |--- feature_33 > 1454.50 | | | | | | | | |--- feature_25 <= 128.50 | | | | | | | | | |--- class: 14 | | | | | | | | |--- feature_25 > 128.50 | | | | | | | | | |--- feature_32 <= 1776.00 | | | | | | | | | | |--- feature_0 <= 0.50 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_0 > 0.50 | | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_32 > 1776.00 | | | | | | | | | | |--- class: 7 | | | | | |--- feature_33 > 1487.50 | | | | | | |--- feature_29 <= 560.00 | | | | | | | |--- feature_29 <= 505.00 | | | | | | | | |--- feature_32 <= 1816.00 | | | | | | | | | |--- feature_31 <= 4437.50 | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_31 > 4437.50 | | | | | | | | | | |--- feature_31 <= 4479.50 | | | | | | | | | | | |--- class: 7 | | | | | | | | | | |--- feature_31 > 4479.50 | | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_32 > 1816.00 | | | | | | | | | |--- feature_28 <= 1710.00 | | | | | | | | | | |--- class: 7 | | | | | | | | | |--- feature_28 > 1710.00 | | | | | | | | | | |--- class: 14 | | | | | | | |--- feature_29 > 505.00 | | | | | | | | |--- feature_23 <= 45108.50 | | | | | | | | | |--- feature_23 <= 36496.00 | | | | | | | | | | |--- feature_34 <= 131.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- 
feature_34 > 131.50 | | | | | | | | | | | |--- class: 7 | | | | | | | | | |--- feature_23 > 36496.00 | | | | | | | | | | |--- feature_28 <= 1594.50 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_28 > 1594.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- feature_23 > 45108.50 | | | | | | | | | |--- feature_13 <= 0.50 | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_13 > 0.50 | | | | | | | | | | |--- class: 7 | | | | | | |--- feature_29 > 560.00 | | | | | | | |--- feature_25 <= 95.50 | | | | | | | | |--- feature_28 <= 1497.50 | | | | | | | | | |--- class: 8 | | | | | | | | |--- feature_28 > 1497.50 | | | | | | | | | |--- class: 14 | | | | | | | |--- feature_25 > 95.50 | | | | | | | | |--- feature_34 <= 155.50 | | | | | | | | | |--- feature_29 <= 625.00 | | | | | | | | | | |--- class: 7 | | | | | | | | | |--- feature_29 > 625.00 | | | | | | | | | | |--- class: 27 | | | | | | | | |--- feature_34 > 155.50 | | | | | | | | | |--- class: 27 | | | | |--- feature_31 > 4481.50 | | | | | |--- feature_31 <= 4923.50 | | | | | | |--- feature_30 <= 2105.76 | | | | | | | |--- feature_2 <= 0.50 | | | | | | | | |--- feature_31 <= 4721.00 | | | | | | | | | |--- feature_32 <= 1811.00 | | | | | | | | | | |--- feature_28 <= 1636.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- feature_28 > 1636.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_32 > 1811.00 | | | | | | | | | | |--- feature_28 <= 1485.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_28 > 1485.00 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | |--- feature_31 > 4721.00 | | | | | | | | | |--- feature_32 <= 1812.00 | | | | | | | | | | |--- feature_28 <= 1463.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_28 > 1463.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- feature_32 > 1812.00 | | | | | | | | | | |--- feature_28 <= 1398.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_28 > 1398.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | |--- feature_2 > 0.50 | | | | | | | | |--- class: 14 | | | | | | |--- feature_30 > 2105.76 | | | | | | | |--- feature_31 <= 4778.50 | | | | | | | | |--- feature_24 <= 1345.00 | | | | | | | | | |--- class: 27 | | | | | | | | |--- feature_24 > 1345.00 | | | | | | | | | |--- feature_31 <= 4529.50 | | | | | | | | | | |--- feature_33 <= 1500.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_33 > 1500.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_31 > 4529.50 | | | | | | | | | | |--- feature_29 <= 543.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | | |--- feature_29 > 543.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | |--- feature_31 > 4778.50 | | | | | | | | |--- feature_27 <= 4.50 | | | | | | | | | |--- feature_23 <= 48428.00 | | | | | | | | | | |--- feature_28 <= 1684.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_28 > 1684.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_23 > 48428.00 | | | | | | | | | | |--- feature_29 <= 594.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_29 > 594.50 | | | | | | | | | | | |--- class: 26 
| | | | | | | | |--- feature_27 > 4.50 | | | | | | | | | |--- feature_28 <= 1750.50 | | | | | | | | | | |--- feature_30 <= 2278.87 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- feature_30 > 2278.87 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_28 > 1750.50 | | | | | | | | | | |--- feature_0 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_0 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | |--- feature_31 > 4923.50 | | | | | | |--- feature_23 <= 45044.00 | | | | | | | |--- feature_27 <= 4.50 | | | | | | | | |--- feature_32 <= 1825.50 | | | | | | | | | |--- class: 14 | | | | | | | | |--- feature_32 > 1825.50 | | | | | | | | | |--- feature_24 <= 1744.50 | | | | | | | | | | |--- class: 14 | | | | | | | | | |--- feature_24 > 1744.50 | | | | | | | | | | |--- class: 26 | | | | | | | |--- feature_27 > 4.50 | | | | | | | | |--- feature_28 <= 1858.00 | | | | | | | | | |--- feature_2 <= 0.50 | | | | | | | | | | |--- feature_23 <= 43052.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_23 > 43052.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_2 > 0.50 | | | | | | | | | | |--- feature_28 <= 1705.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_28 > 1705.00 | | | | | | | | | | | |--- class: 26 | | | | | | | | |--- feature_28 > 1858.00 | | | | | | | | | |--- class: 26 | | | | | | |--- feature_23 > 45044.00 | | | | | | | |--- feature_27 <= 4.50 | | | | | | | | |--- feature_30 <= 2320.66 | | | | | | | | | |--- class: 26 | | | | | | | | |--- feature_30 > 2320.66 | | | | | | | | | |--- class: 14 | | | | | | | |--- feature_27 > 4.50 | | | | | | | | |--- feature_29 <= 558.50 | | | | | | | | | |--- feature_28 <= 1674.50 | | | | | | | | | | |--- class: 14 | | | | | | | | | |--- feature_28 > 1674.50 | | | | | | | | | | |--- feature_32 <= 1799.50 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_32 > 1799.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- feature_29 > 558.50 | | | | | | | | | |--- feature_26 <= 183.50 | | | | | | | | | | |--- feature_33 <= 1468.50 | | | | | | | | | | | |--- class: 26 | | | | | | | | | | |--- feature_33 > 1468.50 | | | | | | | | | | | |--- class: 14 | | | | | | | | | |--- feature_26 > 183.50 | | | | | | | | | | |--- class: 14 | | |--- feature_30 > 2323.50 | | | |--- feature_23 <= 86012.50 | | | | |--- feature_23 <= 48027.50 | | | | | |--- feature_30 <= 2499.77 | | | | | | |--- feature_2 <= 0.50 | | | | | | | |--- feature_30 <= 2324.13 | | | | | | | | |--- class: 26 | | | | | | | |--- feature_30 > 2324.13 | | | | | | | | |--- feature_28 <= 1856.00 | | | | | | | | | |--- class: 14 | | | | | | | | |--- feature_28 > 1856.00 | | | | | | | | | |--- feature_32 <= 1871.00 | | | | | | | | | | |--- class: 14 | | | | | | | | | |--- feature_32 > 1871.00 | | | | | | | | | | |--- class: 26 | | | | | | |--- feature_2 > 0.50 | | | | | | | |--- feature_13 <= 0.50 | | | | | | | | |--- class: 14 | | | | | | | |--- feature_13 > 0.50 | | | | | | | | |--- class: 26 | | | | | |--- feature_30 > 2499.77 | | | | | | |--- feature_26 <= 157.00 | | | | | | | |--- class: 6 | | | | | | |--- feature_26 > 157.00 | | | | | | | |--- class: 27 | | | | |--- feature_23 > 48027.50 | | | | | |--- feature_31 <= 4774.50 | | | | | | |--- feature_27 <= 3.00 | | | | | | | |--- feature_29 <= 508.00 | | | | | | | | |--- feature_34 <= 142.50 | | | | | | 
| | | |--- feature_2 <= 0.50 | | | | | | | | | | |--- class: 0 | | | | | | | | | |--- feature_2 > 0.50 | | | | | | | | | | |--- class: 2 | | | | | | | | |--- feature_34 > 142.50 | | | | | | | | | |--- class: 0 | | | | | | | |--- feature_29 > 508.00 | | | | | | | | |--- class: 2 | | | | | | |--- feature_27 > 3.00 | | | | | | | |--- feature_29 <= 614.00 | | | | | | | | |--- feature_31 <= 4739.50 | | | | | | | | | |--- feature_30 <= 2633.01 | | | | | | | | | | |--- feature_32 <= 1908.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_32 > 1908.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_30 > 2633.01 | | | | | | | | | | |--- class: 26 | | | | | | | | |--- feature_31 > 4739.50 | | | | | | | | | |--- feature_25 <= 178.00 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_25 > 178.00 | | | | | | | | | | |--- feature_34 <= 170.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_34 > 170.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- feature_29 > 614.00 | | | | | | | | |--- feature_23 <= 50517.00 | | | | | | | | | |--- class: 14 | | | | | | | | |--- feature_23 > 50517.00 | | | | | | | | | |--- class: 26 | | | | | |--- feature_31 > 4774.50 | | | | | | |--- feature_27 <= 3.00 | | | | | | | |--- feature_29 <= 521.50 | | | | | | | | |--- feature_34 <= 195.50 | | | | | | | | | |--- feature_31 <= 4928.50 | | | | | | | | | | |--- class: 0 | | | | | | | | | |--- feature_31 > 4928.50 | | | | | | | | | | |--- feature_33 <= 1417.00 | | | | | | | | | | | |--- class: 2 | | | | | | | | | | |--- feature_33 > 1417.00 | | | | | | | | | | | |--- class: 0 | | | | | | | | |--- feature_34 > 195.50 | | | | | | | | | |--- class: 2 | | | | | | | |--- feature_29 > 521.50 | | | | | | | | |--- class: 2 | | | | | | |--- feature_27 > 3.00 | | | | | | | |--- feature_31 <= 5055.50 | | | | | | | | |--- feature_31 <= 4857.50 | | | | | | | | | |--- feature_29 <= 587.00 | | | | | | | | | | |--- feature_32 <= 1890.00 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- feature_32 > 1890.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_29 > 587.00 | | | | | | | | | | |--- feature_32 <= 1773.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_32 > 1773.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- feature_31 > 4857.50 | | | | | | | | | |--- feature_33 <= 1512.50 | | | | | | | | | | |--- feature_32 <= 1793.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_32 > 1793.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- feature_33 > 1512.50 | | | | | | | | | | |--- feature_30 <= 2427.06 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_30 > 2427.06 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | |--- feature_31 > 5055.50 | | | | | | | | |--- feature_23 <= 65744.50 | | | | | | | | | |--- feature_23 <= 64364.00 | | | | | | | | | | |--- feature_32 <= 1782.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_32 > 1782.00 | | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_23 > 64364.00 | | | | | | | | | | |--- feature_12 <= 0.50 | | | | | | | | | | | |--- class: 26 | | | | | | | | | | |--- feature_12 > 0.50 | | | | | | | | | | | |--- class: 11 | | | | | | | | |--- feature_23 > 65744.50 | | | | | | | | | |--- 
feature_29 <= 623.00 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_29 > 623.00 | | | | | | | | | | |--- feature_32 <= 1856.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_32 > 1856.00 | | | | | | | | | | | |--- class: 11 | | | |--- feature_23 > 86012.50 | | | | |--- feature_27 <= 3.00 | | | | | |--- feature_33 <= 1360.00 | | | | | | |--- feature_30 <= 2339.99 | | | | | | | |--- feature_29 <= 524.50 | | | | | | | | |--- class: 10 | | | | | | | |--- feature_29 > 524.50 | | | | | | | | |--- class: 9 | | | | | | |--- feature_30 > 2339.99 | | | | | | | |--- class: 9 | | | | | |--- feature_33 > 1360.00 | | | | | | |--- feature_29 <= 482.50 | | | | | | | |--- feature_30 <= 2598.80 | | | | | | | | |--- feature_26 <= 587.00 | | | | | | | | | |--- feature_28 <= 1947.00 | | | | | | | | | | |--- feature_25 <= 298.50 | | | | | | | | | | | |--- class: 10 | | | | | | | | | | |--- feature_25 > 298.50 | | | | | | | | | | | |--- class: 18 | | | | | | | | | |--- feature_28 > 1947.00 | | | | | | | | | | |--- class: 9 | | | | | | | | |--- feature_26 > 587.00 | | | | | | | | | |--- class: 10 | | | | | | | |--- feature_30 > 2598.80 | | | | | | | | |--- class: 9 | | | | | | |--- feature_29 > 482.50 | | | | | | | |--- feature_30 <= 2356.35 | | | | | | | | |--- class: 9 | | | | | | | |--- feature_30 > 2356.35 | | | | | | | | |--- class: 10 | | | | |--- feature_27 > 3.00 | | | | | |--- feature_31 <= 5029.50 | | | | | | |--- feature_34 <= 191.00 | | | | | | | |--- feature_33 <= 1382.00 | | | | | | | | |--- feature_29 <= 578.50 | | | | | | | | | |--- class: 11 | | | | | | | | |--- feature_29 > 578.50 | | | | | | | | | |--- class: 10 | | | | | | | |--- feature_33 > 1382.00 | | | | | | | | |--- feature_12 <= 0.50 | | | | | | | | | |--- feature_32 <= 1854.00 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_32 > 1854.00 | | | | | | | | | | |--- class: 11 | | | | | | | | |--- feature_12 > 0.50 | | | | | | | | | |--- class: 26 | | | | | | |--- feature_34 > 191.00 | | | | | | | |--- feature_26 <= 362.50 | | | | | | | | |--- class: 11 | | | | | | | |--- feature_26 > 362.50 | | | | | | | | |--- feature_23 <= 138048.00 | | | | | | | | | |--- feature_28 <= 2240.00 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_28 > 2240.00 | | | | | | | | | | |--- class: 11 | | | | | | | | |--- feature_23 > 138048.00 | | | | | | | | | |--- feature_28 <= 2083.50 | | | | | | | | | | |--- feature_31 <= 4932.00 | | | | | | | | | | | |--- class: 10 | | | | | | | | | | |--- feature_31 > 4932.00 | | | | | | | | | | | |--- class: 18 | | | | | | | | | |--- feature_28 > 2083.50 | | | | | | | | | | |--- class: 11 | | | | | |--- feature_31 > 5029.50 | | | | | | |--- feature_27 <= 4.50 | | | | | | | |--- feature_33 <= 1360.50 | | | | | | | | |--- class: 10 | | | | | | | |--- feature_33 > 1360.50 | | | | | | | | |--- feature_30 <= 2358.61 | | | | | | | | | |--- feature_24 <= 2986.50 | | | | | | | | | | |--- class: 11 | | | | | | | | | |--- feature_24 > 2986.50 | | | | | | | | | | |--- class: 10 | | | | | | | | |--- feature_30 > 2358.61 | | | | | | | | | |--- feature_23 <= 87802.00 | | | | | | | | | | |--- feature_30 <= 2568.96 | | | | | | | | | | | |--- class: 11 | | | | | | | | | | |--- feature_30 > 2568.96 | | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_23 > 87802.00 | | | | | | | | | | |--- feature_26 <= 529.00 | | | | | | | | | | | |--- class: 11 | | | | | | | | | | |--- feature_26 > 529.00 | | | | | | | | | | | |--- truncated branch of 
depth 3 | | | | | | |--- feature_27 > 4.50 | | | | | | | |--- feature_32 <= 1866.50 | | | | | | | | |--- class: 26 | | | | | | | |--- feature_32 > 1866.50 | | | | | | | | |--- feature_29 <= 462.50 | | | | | | | | | |--- class: 18 | | | | | | | | |--- feature_29 > 462.50 | | | | | | | | | |--- class: 11 |--- feature_33 > 1537.50 | |--- feature_30 <= 2617.51 | | |--- feature_31 <= 4468.50 | | | |--- feature_30 <= 1814.98 | | | | |--- feature_31 <= 3845.50 | | | | | |--- feature_34 <= 162.50 | | | | | | |--- feature_33 <= 1704.00 | | | | | | | |--- feature_32 <= 1848.50 | | | | | | | | |--- feature_26 <= 172.50 | | | | | | | | | |--- class: 13 | | | | | | | | |--- feature_26 > 172.50 | | | | | | | | | |--- class: 1 | | | | | | | |--- feature_32 > 1848.50 | | | | | | | | |--- class: 6 | | | | | | |--- feature_33 > 1704.00 | | | | | | | |--- feature_28 <= 1034.50 | | | | | | | | |--- class: 6 | | | | | | | |--- feature_28 > 1034.50 | | | | | | | | |--- class: 4 | | | | | |--- feature_34 > 162.50 | | | | | | |--- feature_33 <= 1758.00 | | | | | | | |--- class: 24 | | | | | | |--- feature_33 > 1758.00 | | | | | | | |--- class: 4 | | | | |--- feature_31 > 3845.50 | | | | | |--- feature_27 <= 4.50 | | | | | | |--- feature_33 <= 1611.50 | | | | | | | |--- feature_28 <= 1267.50 | | | | | | | | |--- feature_30 <= 1592.18 | | | | | | | | | |--- feature_31 <= 3897.50 | | | | | | | | | | |--- class: 13 | | | | | | | | | |--- feature_31 > 3897.50 | | | | | | | | | | |--- class: 1 | | | | | | | | |--- feature_30 > 1592.18 | | | | | | | | | |--- class: 6 | | | | | | | |--- feature_28 > 1267.50 | | | | | | | | |--- class: 27 | | | | | | |--- feature_33 > 1611.50 | | | | | | | |--- feature_28 <= 1497.00 | | | | | | | | |--- class: 4 | | | | | | | |--- feature_28 > 1497.00 | | | | | | | | |--- class: 23 | | | | | |--- feature_27 > 4.50 | | | | | | |--- feature_23 <= 15767.50 | | | | | | | |--- feature_30 <= 1650.00 | | | | | | | | |--- feature_31 <= 4305.50 | | | | | | | | | |--- feature_29 <= 341.00 | | | | | | | | | | |--- class: 7 | | | | | | | | | |--- feature_29 > 341.00 | | | | | | | | | | |--- class: 6 | | | | | | | | |--- feature_31 > 4305.50 | | | | | | | | | |--- class: 27 | | | | | | | |--- feature_30 > 1650.00 | | | | | | | | |--- feature_33 <= 1797.00 | | | | | | | | | |--- feature_26 <= 80.00 | | | | | | | | | | |--- class: 12 | | | | | | | | | |--- feature_26 > 80.00 | | | | | | | | | | |--- feature_25 <= 76.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_25 > 76.00 | | | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_33 > 1797.00 | | | | | | | | | |--- class: 8 | | | | | | |--- feature_23 > 15767.50 | | | | | | | |--- feature_34 <= 116.50 | | | | | | | | |--- feature_26 <= 78.00 | | | | | | | | | |--- class: 5 | | | | | | | | |--- feature_26 > 78.00 | | | | | | | | | |--- feature_33 <= 1644.00 | | | | | | | | | | |--- feature_28 <= 1152.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_28 > 1152.00 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | |--- feature_33 > 1644.00 | | | | | | | | | | |--- feature_29 <= 508.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_29 > 508.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- feature_34 > 116.50 | | | | | | | | |--- feature_23 <= 32902.50 | | | | | | | | | |--- feature_30 <= 1694.99 | | | | | | | | | | |--- feature_34 <= 119.50 | | | | | | | | | | | |--- truncated 
branch of depth 4 | | | | | | | | | | |--- feature_34 > 119.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- feature_30 > 1694.99 | | | | | | | | | | |--- feature_26 <= 106.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_26 > 106.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | |--- feature_23 > 32902.50 | | | | | | | | | |--- feature_29 <= 507.00 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_29 > 507.00 | | | | | | | | | | |--- class: 8 | | | |--- feature_30 > 1814.98 | | | | |--- feature_33 <= 1752.00 | | | | | |--- feature_23 <= 44518.50 | | | | | | |--- feature_31 <= 4288.50 | | | | | | | |--- feature_25 <= 76.50 | | | | | | | | |--- feature_24 <= 1440.00 | | | | | | | | | |--- feature_29 <= 481.50 | | | | | | | | | | |--- feature_32 <= 1712.00 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_32 > 1712.00 | | | | | | | | | | | |--- class: 12 | | | | | | | | | |--- feature_29 > 481.50 | | | | | | | | | | |--- feature_24 <= 1245.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_24 > 1245.50 | | | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_24 > 1440.00 | | | | | | | | | |--- feature_30 <= 1892.06 | | | | | | | | | | |--- class: 24 | | | | | | | | | |--- feature_30 > 1892.06 | | | | | | | | | | |--- feature_34 <= 120.00 | | | | | | | | | | | |--- class: 7 | | | | | | | | | | |--- feature_34 > 120.00 | | | | | | | | | | | |--- class: 8 | | | | | | | |--- feature_25 > 76.50 | | | | | | | | |--- feature_29 <= 421.50 | | | | | | | | | |--- feature_0 <= 0.50 | | | | | | | | | | |--- feature_31 <= 4239.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_31 > 4239.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_0 > 0.50 | | | | | | | | | | |--- class: 24 | | | | | | | | |--- feature_29 > 421.50 | | | | | | | | | |--- feature_34 <= 130.50 | | | | | | | | | | |--- feature_28 <= 1349.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_28 > 1349.00 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | |--- feature_34 > 130.50 | | | | | | | | | | |--- feature_24 <= 2045.00 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- feature_24 > 2045.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- feature_31 > 4288.50 | | | | | | | |--- feature_34 <= 154.50 | | | | | | | | |--- feature_26 <= 136.50 | | | | | | | | | |--- feature_30 <= 1891.53 | | | | | | | | | | |--- feature_26 <= 105.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- feature_26 > 105.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | |--- feature_30 > 1891.53 | | | | | | | | | | |--- feature_29 <= 493.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- feature_29 > 493.50 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | |--- feature_26 > 136.50 | | | | | | | | | |--- feature_28 <= 1533.50 | | | | | | | | | | |--- feature_29 <= 477.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_29 > 477.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- feature_28 > 1533.50 | | | | | | | | | | |--- feature_23 <= 33354.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | 
| |--- feature_23 > 33354.00 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | |--- feature_34 > 154.50 | | | | | | | | |--- feature_33 <= 1612.50 | | | | | | | | | |--- feature_25 <= 132.00 | | | | | | | | | | |--- feature_28 <= 1523.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_28 > 1523.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_25 > 132.00 | | | | | | | | | | |--- feature_33 <= 1598.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_33 > 1598.50 | | | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_33 > 1612.50 | | | | | | | | | |--- feature_28 <= 1482.50 | | | | | | | | | | |--- feature_25 <= 98.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_25 > 98.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_28 > 1482.50 | | | | | | | | | | |--- feature_30 <= 2346.06 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_30 > 2346.06 | | | | | | | | | | | |--- class: 7 | | | | | |--- feature_23 > 44518.50 | | | | | | |--- feature_24 <= 2085.00 | | | | | | | |--- feature_30 <= 2092.04 | | | | | | | | |--- feature_13 <= 0.50 | | | | | | | | | |--- feature_29 <= 419.50 | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_29 > 419.50 | | | | | | | | | | |--- feature_30 <= 2078.76 | | | | | | | | | | | |--- class: 7 | | | | | | | | | | |--- feature_30 > 2078.76 | | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_13 > 0.50 | | | | | | | | | |--- feature_31 <= 4434.00 | | | | | | | | | | |--- class: 8 | | | | | | | | | |--- feature_31 > 4434.00 | | | | | | | | | | |--- class: 25 | | | | | | | |--- feature_30 > 2092.04 | | | | | | | | |--- feature_29 <= 718.00 | | | | | | | | | |--- feature_32 <= 2090.50 | | | | | | | | | | |--- feature_32 <= 1723.00 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_32 > 1723.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- feature_32 > 2090.50 | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_29 > 718.00 | | | | | | | | | |--- feature_30 <= 2485.41 | | | | | | | | | | |--- class: 7 | | | | | | | | | |--- feature_30 > 2485.41 | | | | | | | | | | |--- feature_33 <= 1632.50 | | | | | | | | | | | |--- class: 7 | | | | | | | | | | |--- feature_33 > 1632.50 | | | | | | | | | | | |--- class: 25 | | | | | | |--- feature_24 > 2085.00 | | | | | | | |--- feature_13 <= 0.50 | | | | | | | | |--- class: 25 | | | | | | | |--- feature_13 > 0.50 | | | | | | | | |--- class: 8 | | | | |--- feature_33 > 1752.00 | | | | | |--- feature_27 <= 4.50 | | | | | | |--- feature_34 <= 122.00 | | | | | | | |--- feature_24 <= 1804.00 | | | | | | | | |--- feature_34 <= 107.00 | | | | | | | | | |--- feature_29 <= 731.00 | | | | | | | | | | |--- class: 4 | | | | | | | | | |--- feature_29 > 731.00 | | | | | | | | | | |--- class: 5 | | | | | | | | |--- feature_34 > 107.00 | | | | | | | | | |--- class: 4 | | | | | | | |--- feature_24 > 1804.00 | | | | | | | | |--- class: 8 | | | | | | |--- feature_34 > 122.00 | | | | | | | |--- feature_24 <= 2394.50 | | | | | | | | |--- feature_29 <= 752.00 | | | | | | | | | |--- feature_34 <= 137.00 | | | | | | | | | | |--- class: 4 | | | | | | | | | |--- feature_34 > 137.00 | | | | | | | | | | |--- feature_30 <= 2239.12 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_30 > 2239.12 | | | 
| | | | | | | | |--- class: 5 | | | | | | | | |--- feature_29 > 752.00 | | | | | | | | | |--- feature_32 <= 1743.00 | | | | | | | | | | |--- class: 8 | | | | | | | | | |--- feature_32 > 1743.00 | | | | | | | | | | |--- feature_29 <= 891.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_29 > 891.00 | | | | | | | | | | | |--- class: 4 | | | | | | | |--- feature_24 > 2394.50 | | | | | | | | |--- class: 23 | | | | | |--- feature_27 > 4.50 | | | | | | |--- feature_29 <= 518.50 | | | | | | | |--- feature_24 <= 1533.50 | | | | | | | | |--- feature_31 <= 4218.50 | | | | | | | | | |--- class: 8 | | | | | | | | |--- feature_31 > 4218.50 | | | | | | | | | |--- feature_25 <= 78.50 | | | | | | | | | | |--- class: 5 | | | | | | | | | |--- feature_25 > 78.50 | | | | | | | | | | |--- feature_24 <= 1306.00 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_24 > 1306.00 | | | | | | | | | | | |--- class: 5 | | | | | | | |--- feature_24 > 1533.50 | | | | | | | | |--- class: 8 | | | | | | |--- feature_29 > 518.50 | | | | | | | |--- feature_30 <= 2394.61 | | | | | | | | |--- feature_34 <= 106.50 | | | | | | | | | |--- class: 5 | | | | | | | | |--- feature_34 > 106.50 | | | | | | | | | |--- feature_29 <= 545.50 | | | | | | | | | | |--- feature_23 <= 25548.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_23 > 25548.00 | | | | | | | | | | | |--- class: 5 | | | | | | | | | |--- feature_29 > 545.50 | | | | | | | | | | |--- feature_30 <= 1817.97 | | | | | | | | | | | |--- class: 22 | | | | | | | | | | |--- feature_30 > 1817.97 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | |--- feature_30 > 2394.61 | | | | | | | | |--- class: 4 | | |--- feature_31 > 4468.50 | | | |--- feature_0 <= 0.50 | | | | |--- feature_29 <= 609.50 | | | | | |--- feature_33 <= 1579.50 | | | | | | |--- feature_23 <= 30292.50 | | | | | | | |--- feature_33 <= 1559.50 | | | | | | | | |--- feature_23 <= 27086.50 | | | | | | | | | |--- feature_34 <= 168.00 | | | | | | | | | | |--- feature_28 <= 1389.00 | | | | | | | | | | | |--- class: 27 | | | | | | | | | | |--- feature_28 > 1389.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_34 > 168.00 | | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_23 > 27086.50 | | | | | | | | | |--- feature_26 <= 140.00 | | | | | | | | | | |--- class: 8 | | | | | | | | | |--- feature_26 > 140.00 | | | | | | | | | | |--- class: 27 | | | | | | | |--- feature_33 > 1559.50 | | | | | | | | |--- feature_30 <= 1501.00 | | | | | | | | | |--- class: 13 | | | | | | | | |--- feature_30 > 1501.00 | | | | | | | | | |--- feature_31 <= 4553.50 | | | | | | | | | | |--- feature_28 <= 1370.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_28 > 1370.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- feature_31 > 4553.50 | | | | | | | | | | |--- feature_30 <= 1767.53 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_30 > 1767.53 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | |--- feature_23 > 30292.50 | | | | | | | |--- feature_31 <= 4665.00 | | | | | | | | |--- feature_32 <= 1832.50 | | | | | | | | | |--- feature_29 <= 580.00 | | | | | | | | | | |--- feature_29 <= 446.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_29 > 446.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- 
feature_29 > 580.00 | | | | | | | | | | |--- feature_2 <= 0.50 | | | | | | | | | | | |--- class: 7 | | | | | | | | | | |--- feature_2 > 0.50 | | | | | | | | | | | |--- class: 14 | | | | | | | | |--- feature_32 > 1832.50 | | | | | | | | | |--- feature_28 <= 1637.00 | | | | | | | | | | |--- feature_34 <= 128.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_34 > 128.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_28 > 1637.00 | | | | | | | | | | |--- feature_5 <= 0.50 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_5 > 0.50 | | | | | | | | | | | |--- class: 25 | | | | | | | |--- feature_31 > 4665.00 | | | | | | | | |--- feature_28 <= 1781.50 | | | | | | | | | |--- class: 14 | | | | | | | | |--- feature_28 > 1781.50 | | | | | | | | | |--- feature_33 <= 1548.50 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_33 > 1548.50 | | | | | | | | | | |--- class: 14 | | | | | |--- feature_33 > 1579.50 | | | | | | |--- feature_33 <= 1778.50 | | | | | | | |--- feature_30 <= 1800.26 | | | | | | | | |--- feature_28 <= 1306.00 | | | | | | | | | |--- feature_25 <= 86.50 | | | | | | | | | | |--- class: 27 | | | | | | | | | |--- feature_25 > 86.50 | | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_28 > 1306.00 | | | | | | | | | |--- feature_34 <= 128.50 | | | | | | | | | | |--- class: 8 | | | | | | | | | |--- feature_34 > 128.50 | | | | | | | | | | |--- feature_34 <= 161.00 | | | | | | | | | | | |--- class: 22 | | | | | | | | | | |--- feature_34 > 161.00 | | | | | | | | | | | |--- class: 7 | | | | | | | |--- feature_30 > 1800.26 | | | | | | | | |--- feature_29 <= 580.50 | | | | | | | | | |--- feature_33 <= 1629.50 | | | | | | | | | | |--- feature_24 <= 1506.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- feature_24 > 1506.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- feature_33 > 1629.50 | | | | | | | | | | |--- feature_31 <= 4925.00 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- feature_31 > 4925.00 | | | | | | | | | | | |--- class: 23 | | | | | | | | |--- feature_29 > 580.50 | | | | | | | | | |--- feature_30 <= 2227.33 | | | | | | | | | | |--- feature_28 <= 1432.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_28 > 1432.00 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | |--- feature_30 > 2227.33 | | | | | | | | | | |--- feature_30 <= 2422.34 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- feature_30 > 2422.34 | | | | | | | | | | | |--- class: 23 | | | | | | |--- feature_33 > 1778.50 | | | | | | | |--- feature_23 <= 28696.00 | | | | | | | | |--- feature_27 <= 4.50 | | | | | | | | | |--- class: 4 | | | | | | | | |--- feature_27 > 4.50 | | | | | | | | | |--- feature_33 <= 1819.00 | | | | | | | | | | |--- class: 5 | | | | | | | | | |--- feature_33 > 1819.00 | | | | | | | | | | |--- class: 8 | | | | | | | |--- feature_23 > 28696.00 | | | | | | | | |--- feature_24 <= 1545.50 | | | | | | | | | |--- feature_33 <= 1861.00 | | | | | | | | | | |--- class: 5 | | | | | | | | | |--- feature_33 > 1861.00 | | | | | | | | | | |--- class: 8 | | | | | | | | |--- feature_24 > 1545.50 | | | | | | | | | |--- feature_34 <= 158.00 | | | | | | | | | | |--- feature_33 <= 1844.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_33 > 1844.00 | | | | | | | | | | | |--- 
class: 14 | | | | | | | | | |--- feature_34 > 158.00 | | | | | | | | | | |--- class: 19 | | | | |--- feature_29 > 609.50 | | | | | |--- feature_23 <= 36328.00 | | | | | | |--- feature_27 <= 4.50 | | | | | | | |--- feature_34 <= 121.00 | | | | | | | | |--- class: 4 | | | | | | | |--- feature_34 > 121.00 | | | | | | | | |--- feature_30 <= 2548.46 | | | | | | | | | |--- feature_32 <= 1840.00 | | | | | | | | | | |--- feature_29 <= 899.00 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- feature_29 > 899.00 | | | | | | | | | | | |--- class: 4 | | | | | | | | | |--- feature_32 > 1840.00 | | | | | | | | | | |--- feature_25 <= 97.00 | | | | | | | | | | | |--- class: 4 | | | | | | | | | | |--- feature_25 > 97.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- feature_30 > 2548.46 | | | | | | | | | |--- feature_32 <= 2024.50 | | | | | | | | | | |--- class: 20 | | | | | | | | | |--- feature_32 > 2024.50 | | | | | | | | | | |--- class: 21 | | | | | | |--- feature_27 > 4.50 | | | | | | | |--- feature_29 <= 884.00 | | | | | | | | |--- feature_30 <= 1964.85 | | | | | | | | | |--- feature_30 <= 1916.25 | | | | | | | | | | |--- feature_34 <= 155.00 | | | | | | | | | | | |--- class: 22 | | | | | | | | | | |--- feature_34 > 155.00 | | | | | | | | | | | |--- class: 8 | | | | | | | | | |--- feature_30 > 1916.25 | | | | | | | | | | |--- feature_33 <= 1608.00 | | | | | | | | | | | |--- class: 25 | | | | | | | | | | |--- feature_33 > 1608.00 | | | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_30 > 1964.85 | | | | | | | | | |--- feature_23 <= 31306.50 | | | | | | | | | | |--- feature_25 <= 54.50 | | | | | | | | | | | |--- class: 5 | | | | | | | | | | |--- feature_25 > 54.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | |--- feature_23 > 31306.50 | | | | | | | | | | |--- feature_32 <= 1884.00 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | | |--- feature_32 > 1884.00 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | |--- feature_29 > 884.00 | | | | | | | | |--- class: 4 | | | | | |--- feature_23 > 36328.00 | | | | | | |--- feature_31 <= 4695.50 | | | | | | | |--- feature_32 <= 1854.00 | | | | | | | | |--- feature_33 <= 1664.00 | | | | | | | | | |--- feature_29 <= 610.50 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_29 > 610.50 | | | | | | | | | | |--- class: 8 | | | | | | | | |--- feature_33 > 1664.00 | | | | | | | | | |--- feature_26 <= 151.00 | | | | | | | | | | |--- feature_29 <= 861.50 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_29 > 861.50 | | | | | | | | | | | |--- class: 21 | | | | | | | | | |--- feature_26 > 151.00 | | | | | | | | | | |--- class: 25 | | | | | | | |--- feature_32 > 1854.00 | | | | | | | | |--- feature_24 <= 2052.50 | | | | | | | | | |--- feature_30 <= 2292.75 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_30 > 2292.75 | | | | | | | | | | |--- feature_32 <= 1998.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_32 > 1998.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- feature_24 > 2052.50 | | | | | | | | | |--- class: 23 | | | | | | |--- feature_31 > 4695.50 | | | | | | | |--- feature_33 <= 1600.50 | | | | | | | | |--- feature_28 <= 1779.00 | | | | | | | | | |--- feature_23 <= 46001.50 | | | | | | | | | | |--- class: 14 | | | | | | | | | |--- feature_23 > 46001.50 | | | | | | | | | | |--- class: 26 | | | | | | | | 
|--- feature_28 > 1779.00 | | | | | | | | | |--- feature_29 <= 718.50 | | | | | | | | | | |--- class: 11 | | | | | | | | | |--- feature_29 > 718.50 | | | | | | | | | | |--- class: 26 | | | | | | | |--- feature_33 > 1600.50 | | | | | | | | |--- feature_33 <= 1851.00 | | | | | | | | | |--- feature_1 <= 0.50 | | | | | | | | | | |--- class: 23 | | | | | | | | | |--- feature_1 > 0.50 | | | | | | | | | | |--- feature_24 <= 2124.00 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- feature_24 > 2124.00 | | | | | | | | | | | |--- class: 23 | | | | | | | | |--- feature_33 > 1851.00 | | | | | | | | | |--- feature_29 <= 1059.50 | | | | | | | | | | |--- class: 21 | | | | | | | | | |--- feature_29 > 1059.50 | | | | | | | | | | |--- class: 20 | | | |--- feature_0 > 0.50 | | | | |--- feature_32 <= 1966.50 | | | | | |--- feature_33 <= 1598.50 | | | | | | |--- feature_30 <= 2374.69 | | | | | | | |--- feature_31 <= 4685.00 | | | | | | | | |--- feature_28 <= 1770.50 | | | | | | | | | |--- feature_29 <= 616.00 | | | | | | | | | | |--- feature_28 <= 1642.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- feature_28 > 1642.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_29 > 616.00 | | | | | | | | | | |--- feature_34 <= 143.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_34 > 143.00 | | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_28 > 1770.50 | | | | | | | | | |--- feature_24 <= 2224.50 | | | | | | | | | | |--- feature_31 <= 4496.50 | | | | | | | | | | | |--- class: 8 | | | | | | | | | | |--- feature_31 > 4496.50 | | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_24 > 2224.50 | | | | | | | | | | |--- class: 14 | | | | | | | |--- feature_31 > 4685.00 | | | | | | | | |--- feature_28 <= 1848.00 | | | | | | | | | |--- feature_27 <= 4.50 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_27 > 4.50 | | | | | | | | | | |--- feature_32 <= 1877.00 | | | | | | | | | | | |--- class: 14 | | | | | | | | | | |--- feature_32 > 1877.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- feature_28 > 1848.00 | | | | | | | | | |--- class: 26 | | | | | | |--- feature_30 > 2374.69 | | | | | | | |--- feature_31 <= 4871.50 | | | | | | | | |--- feature_32 <= 1837.00 | | | | | | | | | |--- feature_31 <= 4701.50 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_31 > 4701.50 | | | | | | | | | | |--- class: 26 | | | | | | | | |--- feature_32 > 1837.00 | | | | | | | | | |--- feature_30 <= 2379.80 | | | | | | | | | | |--- class: 26 | | | | | | | | | |--- feature_30 > 2379.80 | | | | | | | | | | |--- feature_34 <= 131.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- feature_34 > 131.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- feature_31 > 4871.50 | | | | | | | | |--- feature_24 <= 2459.00 | | | | | | | | | |--- feature_32 <= 1930.00 | | | | | | | | | | |--- feature_30 <= 2535.06 | | | | | | | | | | | |--- class: 26 | | | | | | | | | | |--- feature_30 > 2535.06 | | | | | | | | | | | |--- class: 22 | | | | | | | | | |--- feature_32 > 1930.00 | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_24 > 2459.00 | | | | | | | | | |--- feature_29 <= 707.50 | | | | | | | | | | |--- feature_29 <= 643.50 | | | | | | | | | | | |--- class: 25 | | | | | | | | | | |--- feature_29 > 643.50 | | | | | | | | | | | |--- class: 11 | | | | | | | | | |--- 
feature_29 > 707.50 | | | | | | | | | | |--- class: 26 | | | | | |--- feature_33 > 1598.50 | | | | | | |--- feature_33 <= 1808.00 | | | | | | | |--- feature_31 <= 4792.00 | | | | | | | | |--- feature_33 <= 1627.50 | | | | | | | | | |--- feature_30 <= 2325.07 | | | | | | | | | | |--- feature_23 <= 39001.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_23 > 39001.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- feature_30 > 2325.07 | | | | | | | | | | |--- feature_32 <= 1962.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_32 > 1962.50 | | | | | | | | | | | |--- class: 23 | | | | | | | | |--- feature_33 > 1627.50 | | | | | | | | | |--- feature_31 <= 4514.50 | | | | | | | | | | |--- feature_34 <= 161.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- feature_34 > 161.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_31 > 4514.50 | | | | | | | | | | |--- feature_29 <= 732.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- feature_29 > 732.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | |--- feature_31 > 4792.00 | | | | | | | | |--- feature_29 <= 705.00 | | | | | | | | | |--- feature_28 <= 1866.00 | | | | | | | | | | |--- feature_34 <= 144.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_34 > 144.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- feature_28 > 1866.00 | | | | | | | | | | |--- feature_33 <= 1667.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_33 > 1667.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- feature_29 > 705.00 | | | | | | | | | |--- feature_32 <= 1856.50 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_32 > 1856.50 | | | | | | | | | | |--- class: 22 | | | | | | |--- feature_33 > 1808.00 | | | | | | | |--- feature_25 <= 121.00 | | | | | | | | |--- feature_27 <= 4.50 | | | | | | | | | |--- feature_31 <= 4855.50 | | | | | | | | | | |--- class: 5 | | | | | | | | | |--- feature_31 > 4855.50 | | | | | | | | | | |--- class: 8 | | | | | | | | |--- feature_27 > 4.50 | | | | | | | | | |--- feature_29 <= 583.50 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_29 > 583.50 | | | | | | | | | | |--- class: 8 | | | | | | | |--- feature_25 > 121.00 | | | | | | | | |--- class: 23 | | | | |--- feature_32 > 1966.50 | | | | | |--- feature_31 <= 4622.50 | | | | | | |--- feature_33 <= 1708.50 | | | | | | | |--- feature_32 <= 2083.50 | | | | | | | | |--- feature_30 <= 2535.78 | | | | | | | | | |--- class: 7 | | | | | | | | |--- feature_30 > 2535.78 | | | | | | | | | |--- class: 25 | | | | | | | |--- feature_32 > 2083.50 | | | | | | | | |--- feature_26 <= 180.00 | | | | | | | | | |--- class: 23 | | | | | | | | |--- feature_26 > 180.00 | | | | | | | | | |--- class: 25 | | | | | | |--- feature_33 > 1708.50 | | | | | | | |--- class: 25 | | | | | |--- feature_31 > 4622.50 | | | | | | |--- feature_23 <= 59516.50 | | | | | | | |--- feature_31 <= 4729.00 | | | | | | | | |--- feature_30 <= 2472.64 | | | | | | | | | |--- feature_25 <= 135.50 | | | | | | | | | | |--- class: 23 | | | | | | | | | |--- feature_25 > 135.50 | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_30 > 2472.64 | | | | | | | | | |--- class: 25 | | | | | | | |--- feature_31 > 4729.00 | | | | | | | | |--- 
feature_24 <= 1975.00 | | | | | | | | | |--- feature_31 <= 4818.50 | | | | | | | | | | |--- class: 25 | | | | | | | | | |--- feature_31 > 4818.50 | | | | | | | | | | |--- class: 22 | | | | | | | | |--- feature_24 > 1975.00 | | | | | | | | | |--- feature_26 <= 260.50 | | | | | | | | | | |--- class: 23 | | | | | | | | | |--- feature_26 > 260.50 | | | | | | | | | | |--- class: 25 | | | | | | |--- feature_23 > 59516.50 | | | | | | | |--- feature_33 <= 1571.50 | | | | | | | | |--- class: 26 | | | | | | | |--- feature_33 > 1571.50 | | | | | | | | |--- feature_32 <= 1972.50 | | | | | | | | | |--- feature_29 <= 644.50 | | | | | | | | | | |--- class: 23 | | | | | | | | | |--- feature_29 > 644.50 | | | | | | | | | | |--- class: 25 | | | | | | | | |--- feature_32 > 1972.50 | | | | | | | | | |--- class: 23 | |--- feature_30 > 2617.51 | | |--- feature_33 <= 2112.00 | | | |--- feature_26 <= 218.50 | | | | |--- feature_28 <= 2057.50 | | | | | |--- feature_23 <= 40753.50 | | | | | | |--- feature_24 <= 2124.50 | | | | | | | |--- feature_29 <= 1062.50 | | | | | | | | |--- feature_23 <= 37342.50 | | | | | | | | | |--- feature_29 <= 1014.50 | | | | | | | | | | |--- feature_25 <= 131.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- feature_25 > 131.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- feature_29 > 1014.50 | | | | | | | | | | |--- feature_32 <= 1957.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_32 > 1957.50 | | | | | | | | | | | |--- class: 20 | | | | | | | | |--- feature_23 > 37342.50 | | | | | | | | | |--- feature_28 <= 1831.00 | | | | | | | | | | |--- feature_29 <= 946.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- feature_29 > 946.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- feature_28 > 1831.00 | | | | | | | | | | |--- feature_33 <= 1894.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_33 > 1894.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | |--- feature_29 > 1062.50 | | | | | | | | |--- feature_32 <= 2106.00 | | | | | | | | | |--- feature_25 <= 67.00 | | | | | | | | | | |--- class: 21 | | | | | | | | | |--- feature_25 > 67.00 | | | | | | | | | | |--- feature_33 <= 2021.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- feature_33 > 2021.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- feature_32 > 2106.00 | | | | | | | | | |--- class: 19 | | | | | | |--- feature_24 > 2124.50 | | | | | | | |--- feature_32 <= 1913.50 | | | | | | | | |--- class: 16 | | | | | | | |--- feature_32 > 1913.50 | | | | | | | | |--- feature_23 <= 30719.50 | | | | | | | | | |--- class: 20 | | | | | | | | |--- feature_23 > 30719.50 | | | | | | | | | |--- class: 23 | | | | | |--- feature_23 > 40753.50 | | | | | | |--- feature_1 <= 0.50 | | | | | | | |--- feature_33 <= 1838.00 | | | | | | | | |--- feature_30 <= 2678.00 | | | | | | | | | |--- class: 22 | | | | | | | | |--- feature_30 > 2678.00 | | | | | | | | | |--- feature_30 <= 2811.54 | | | | | | | | | | |--- class: 23 | | | | | | | | | |--- feature_30 > 2811.54 | | | | | | | | | | |--- class: 16 | | | | | | | |--- feature_33 > 1838.00 | | | | | | | | |--- feature_23 <= 56701.50 | | | | | | | | | |--- feature_24 <= 2186.50 | | | | | | | | | | |--- feature_33 <= 1979.50 | | | | | | | | | | | |--- class: 20 | | | | | | | | | | |--- feature_33 > 1979.50 | | | | | | | | 
(Truncated text representation of the trained decision tree, as produced by tree.export_text; the full dump is omitted here for brevity.)
fig = plt.figure(figsize=(25, 20))
decision_tree_plt = tree.plot_tree(decision_tree, feature_names=feature_names, class_names=cars["Produktgruppe"].unique(), filled=True)
Repeat subtasks 1 to 5 from the decision tree for a random forest. Compare the performance of the two approaches.
Training the random forest took somewhat longer than training the decision tree.
# Train random forest
random_forest = random_forest.fit(X_train, y_train)
# Predict the response for test data
random_forest_y_pred = random_forest.predict(X_test)
Here one can see that the random forest produces much more accurate results; some classes even reach perfect precision.
# Create classification report
report = classification_report(y_test, random_forest_y_pred, target_names=cars["Produktgruppe"].unique())
print(report)
precision recall f1-score support
T5-Klasse Pkw 0.88 0.91 0.90 82
Cabrio ab Mittelklasse 0.97 0.99 0.98 106
Cabrio bis Kompaktklasse 0.91 0.93 0.92 138
Minicars 1.00 0.88 0.93 32
Kompakt-SUV / Geländewagen 0.98 0.92 0.95 66
Sportcabrio / Targa 0.82 0.65 0.72 48
Luxusklasse Cabrio 0.94 0.95 0.95 433
Coupé ab Mittelklasse 0.88 0.81 0.84 587
Sportcoupé 0.84 0.85 0.84 505
Luxusklasse Coupé 0.75 0.67 0.71 9
Coupé bis Kompaktklasse 0.84 0.64 0.73 25
untere Mittelklasse / Kompaktklasse 0.91 0.93 0.92 105
Kleinwagen 0.96 0.70 0.81 37
Sprinter-Klasse 0.99 0.99 0.99 302
T5-Klasse Lkw 0.97 0.97 0.97 1207
Kleintransporter / Lkw 0.90 0.97 0.93 103
Mittelklasse 1.00 1.00 1.00 16
obere Mittelklasse 0.74 0.67 0.70 76
Kleintransporter / Pkw 0.56 0.60 0.58 53
Luxusklasse Limousine / Kombi 1.00 1.00 1.00 163
Motorcaravan 0.95 0.96 0.95 162
Pickup 0.95 0.91 0.93 233
große SUV / Geländewagen 0.94 0.92 0.93 90
kleine SUV / Geländewagen 0.98 0.96 0.97 261
mittlere SUV / Geländewagen 0.88 0.85 0.86 176
Kompaktvan 0.89 0.89 0.89 590
Microvan / Minivan 0.94 0.95 0.95 332
Van 0.92 0.96 0.94 1322
accuracy 0.92 7259
macro avg 0.90 0.87 0.89 7259
weighted avg 0.92 0.92 0.92 7259
The diagonal of the confusion matrix is much more pronounced here, and noticeably fewer misclassifications seem to occur.
# Create confusion matrix
random_forest_cm = confusion_matrix(y_test, random_forest_y_pred)
random_forest_cmd = ConfusionMatrixDisplay(random_forest_cm)
figure, axes = plt.subplots(figsize=(19, 14))
axes.grid(False)
random_forest_cmd.plot(ax=axes)
<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x185ec790d60>
Compared to the DecisionTree, the accuracy on the different test folds is considerably higher; it ranges between roughly 74% and 84%. This suggests that the model could also achieve high accuracy on unseen data.
# 10x cross validation
random_forest_score = cross_val_score(random_forest, X, y, cv=10)
print(random_forest_score)
[0.76239669 0.7392562 0.77066116 0.75702479 0.78875568 0.75196362 0.77346011 0.82017363 0.8354692 0.76560562]
With the random forest, three features also dominate in importance, but the remaining features receive comparatively more weight, which is why the maximum importance value is smaller than for the decision tree.
# Feature importances
print(random_forest.feature_importances_)
plt.bar(np.arange(len(random_forest.feature_importances_)), random_forest.feature_importances_)
plt.xticks(np.arange(len(decision_tree.feature_importances_)), feature_names, rotation='vertical', fontsize = 'x-small')
plt.show()
[9.11392357e-03 1.15454879e-02 9.18413800e-03 2.44579932e-04 2.49533695e-04 1.16006587e-03 3.03638966e-05 2.66170256e-04 6.94864072e-05 1.03393347e-05 1.22613838e-05 1.26979537e-08 3.93598920e-03 4.00666734e-03 2.08413022e-04 5.89467982e-06 8.06125987e-04 2.79693847e-04 2.66904574e-04 1.59776541e-03 7.06969700e-04 3.95088274e-03 4.26789246e-03 7.99655869e-02 4.61350570e-02 4.78757246e-02 4.53549291e-02 4.93981106e-02 9.18300041e-02 8.25639606e-02 1.17283637e-01 1.32840401e-01 5.62443675e-02 1.48369437e-01 5.02192229e-02]
In this part of the experiment, the input features
"CCM", "HST PS", "Anzahl der Türen", "Leergewicht", "Zuladung", "Länge", "Breite", "Höhe"
are used to estimate the target variable
CO2-Emissionen
To this end, a regression model that is as good as possible is to be trained.
# 1. Scatterplot from all features with goal variable
features = ["CCM","HST PS", "Anzahl der Türen", "Leergewicht", "Zuladung", "Länge", "Breite", "Höhe"]
goal_variable = "CO2-Emissionen"
for feature in features:
korelation = px.scatter(cars, x=feature, y=goal_variable)
korelation.show()
As expected, the correlation between the number of doors and the CO2 emissions is practically non-existent. HST PS and CCM correlate most strongly with the emissions. Surprisingly, there is also a slight correlation between width and emissions. That very tall cars tend to cause lower CO2 emissions was not something we were aware of either, and it is hard to explain.
In some plots the correlation is hard to see. This is because the data contain extreme outliers, which compress the plot and obscure the correlation. One example of this is the empty weight (Leergewicht).
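To back these visual impressions up with numbers, the pairwise Pearson correlations with the target can be computed directly; a small sketch using the cars DataFrame, features list and goal_variable defined above:
# Pearson correlation of every input feature with the CO2 emissions
correlations = cars[features + [goal_variable]].corr()[goal_variable].drop(goal_variable)
print(correlations.sort_values(ascending=False))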
Assign the input features to the two-dimensional array X and the target variable to the one-dimensional array y. Then split X and y into training and test data, again with a 70/30 ratio. We prepare the data as before, but can skip the one-hot encoding, since all features here are numeric.
# Prepare X - Input data and y - goal variable
# Fill feature values from cars dataframe
feature_values = []
for column in features:
feature_values.append(cars[column])
# Set X
X = np.array(feature_values).transpose() # Transpose to switch axes
print("X:", X)
# Set Y
y = np.array(cars[goal_variable])
print("y:", y)
X: [[1896 154 4 ... 4852 1849 2019] [1990 148 4 ... 4859 1827 1938] [1943 150 4 ... 4788 1823 1990] ... [2018 240 5 ... 5033 2011 1592] [1968 238 5 ... 4795 1941 1602] [2019 239 5 ... 5022 1865 1771]] y: [218. 218. 218. ... 153. 153. 162.]
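As a side note, the same X and y can be built more directly from the DataFrame; a minimal, equivalent sketch using the cars DataFrame, features list and goal_variable from above:
# Equivalent, more idiomatic construction of X and y
X_alt = cars[features].to_numpy()
y_alt = cars[goal_variable].to_numpy()
print(X_alt.shape, y_alt.shape)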
We again split the data with train_test_split; for the reasons mentioned above, the random state is again set to 0.
# 2. Split into training and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, random_state=0)
print("Anzahl X Trainingsdaten", len(X_train))
print("Anzahl X Testdaten", len(X_test))
print("Anzahl y Trainingsdaten", len(y_train))
print("Anzahl y Testdaten", len(y_test))
Anzahl X Trainingsdaten 16935 Anzahl X Testdaten 7259 Anzahl y Trainingsdaten 16935 Anzahl y Testdaten 7259
The scaling has to be applied to both the training and the test data. Why may the scaling only be performed after the split into the two partitions? What has to be kept in mind?
The scaling may only be done after the split because the scaler derives its parameters (the minimum and maximum) from the data it is fitted on. If it were fitted on the full data set, information about the test data would leak into the preprocessing, and the evaluation would no longer reflect how the model behaves on truly unseen data. One has to keep in mind that after the MinMaxScaler the maximum value becomes 1 and the minimum 0. We did not fully understand why y should be scaled as well; in the examples we found online, usually only X is scaled. To scale y we also needed the reshape, because the scaler expects a two-dimensional input.
# Scaling of X and y, trainings and test data
# Initialize Scaler
scalerX_train = MinMaxScaler()
scalerX_test = MinMaxScaler()
scalerY_train = MinMaxScaler()
scalerY_test = MinMaxScaler()
X_train = scalerX_train.fit_transform(X_train)
X_test = scalerX_test.fit_transform(X_test)
y_train = scalerY_train.fit_transform(y_train.reshape(-1, 1))
y_test = scalerY_test.fit_transform(y_test.reshape(-1, 1))
y_train = y_train.flatten()
y_test = y_test.flatten()
print("X_train", X_train)
print("X_test", X_test)
print("y_train", y_train)
print("y_test", y_test)
X_train [[0.12119149 0.1559633 1. ... 0.41121495 0.47576531 0.26478568] [0.01225532 0.05963303 1. ... 0.44358331 0.25382653 0.20021704] [0.1227234 0.10703364 1. ... 0.35673581 0.47576531 0.2116115 ] ... [0.20561702 0.20948012 0.66666667 ... 0.49259175 0.51913265 0.12533912] [0.0973617 0.03975535 0.33333333 ... 0.31023478 0.34183673 0.15192621] [0.20085106 0.117737 0.66666667 ... 0.74720766 0.79591837 0.79164406]] X_test [[0.11473365 0.11793215 1. ... 0.52696263 0.63403782 0.40524434] [0.20399074 0.27786753 1. ... 0.52947078 0.46496107 0.14421931] [0.02209157 0.09854604 1. ... 0.42864309 0.45272525 0.14064362] ... [0.21521468 0.07915994 0.66666667 ... 0.70729872 0.76084538 0.96066746] [0.2066631 0.13731826 0.33333333 ... 0.41660396 0.38932147 0.15733015] [0.20238732 0.14701131 1. ... 0.51693002 0.4705228 0.11442193]] y_train [0.3024453 0.3024453 0.41827542 ... 0.27670528 0.31274131 0.38223938] y_test [0.3318556 0.37935402 0.36985434 ... 0.47435085 0.30335655 0.35718809]
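For reference, the variant in which no information from the test partition flows into the preprocessing fits the scalers on the training data only and reuses them for the test data. A sketch, assuming the unscaled X_train, X_test, y_train and y_test as they come out of train_test_split:
# Fit the scalers on the training data only, then apply the same transformation to the test data
scaler_X = MinMaxScaler().fit(X_train)
scaler_y = MinMaxScaler().fit(y_train.reshape(-1, 1))
X_train_scaled = scaler_X.transform(X_train)
X_test_scaled = scaler_X.transform(X_test)  # no fit on the test data
y_train_scaled = scaler_y.transform(y_train.reshape(-1, 1)).flatten()
y_test_scaled = scaler_y.transform(y_test.reshape(-1, 1)).flatten()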
Carry out the following subtasks both for a single layer perceptron and for a multi layer perceptron with 20 neurons in the hidden layer. Compare the performance of the two approaches at the end.
# Calculates and prints regression metrics
def determineRegressionMetrics(y_test,y_pred,title=""):
mse = mean_squared_error(y_test, y_pred)
mad = mean_absolute_error(y_test, y_pred)
rmsle=np.sqrt(mean_squared_error(np.log(y_test+1),np.log(y_pred+1)))# +1 for avoiding log(0)
r2=r2_score(y_test, y_pred)
med=median_absolute_error(y_test, y_pred)
print(title)
print("Mean absolute error =", round(mad, 2))
print("Mean squared error =", round(mse, 2))
print("Median absolute error =", round(med, 2))
print("R2 score =", round(r2, 2))
print("Root Mean Squared Logarithmic Error =",rmsle)
# Train Algorithm SLP
slp_sgdr = SGDRegressor()
slp_sgdr.fit(X_train, y_train)
SGDRegressor()
# Train Algorithm MLP
mlp_sgdr = MLPRegressor(hidden_layer_sizes=(20,))
mlp_sgdr.fit(X_train, y_train)
MLPRegressor(hidden_layer_sizes=(20,))
# Predict with test data SLP
slp_y_pred = slp_sgdr.predict(X_test)
print(slp_y_pred)
[0.31822368 0.32574023 0.25280027 ... 0.41490164 0.30298384 0.29904303]
# Predict with test data MLP
mlp_y_pred = mlp_sgdr.predict(X_test)
print(mlp_y_pred)
[0.31929034 0.31420597 0.28017305 ... 0.40990978 0.29920485 0.28734179]
The mean absolute error is the average of the absolute differences between the actual and the predicted values on the test data. Because it is measured only on the test set, it does not necessarily tell us how the model will behave on new, real-world data.
The mean squared error squares the differences before averaging them, so larger errors are penalised more strongly. It is often used to assess the performance of the trained model; a value close to zero means the model is very accurate.
The root mean squared logarithmic error works like the RMSE, but on the logarithms of the values (log(y+1), to avoid log(0)); the smaller the value, the better the model fits the data.
The R² score expresses how well the model fits the data. It is at most 1, where 1 means the regression explains the data perfectly (and it can even become negative for very poor models); a very high R² on the training data can also be a sign of overfitting.
Sources: https://www.studytonight.com/post/what-is-mean-squared-error-mean-absolute-error-root-mean-squared-error-and-r-squared, https://www.statology.org/what-is-a-good-rmse/
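To make the definitions concrete, the following sketch computes the four metrics by hand with NumPy on a couple of made-up example values; the results should agree with the sklearn functions used in determineRegressionMetrics above.
# Hypothetical example values, only to illustrate the formulas
y_true_example = np.array([0.30, 0.42, 0.28, 0.35])
y_pred_example = np.array([0.32, 0.40, 0.25, 0.36])
mae = np.mean(np.abs(y_true_example - y_pred_example))  # mean absolute error
mse = np.mean((y_true_example - y_pred_example) ** 2)  # mean squared error
rmsle = np.sqrt(np.mean((np.log1p(y_true_example) - np.log1p(y_pred_example)) ** 2))  # root mean squared logarithmic error
r2 = 1 - np.sum((y_true_example - y_pred_example) ** 2) / np.sum((y_true_example - np.mean(y_true_example)) ** 2)  # coefficient of determination
print(mae, mse, rmsle, r2)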
We could not fully explain why we get a negative R² for the SLP. According to Stack Overflow, it means that the model does not fit the data well, i.e. it performs worse than simply predicting the mean of the target values, and that setting an intercept might help. Source: https://stats.stackexchange.com/questions/183265/what-does-negative-r-squared-mean Unfortunately we did not have the time to investigate this further. Depending on the run, the R² value changes strongly, while the other metrics stay relatively constant.
The MLP has a positive R² and therefore fits the data better, but not so well that one would have to speak of overfitting. Whether it fits too poorly is hard to say, since the errors are actually quite small.
Judging by the MSE of 0.01, both models perform roughly equally well. However, since the R² value of the MLP is consistently higher, we rate it as better than the SLP.
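That R² compares the model against the trivial baseline of always predicting the mean can be checked directly: a constant prediction of the training mean gives an R² of roughly 0, so any model with a negative R² does worse than this baseline. A quick sketch on the scaled test data from above:
# Baseline: always predict the mean of the training targets
baseline_pred = np.full_like(y_test, y_train.mean())
print("R2 of the mean baseline:", r2_score(y_test, baseline_pred))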
# 3. Check quality of model SLP
determineRegressionMetrics(y_test, slp_y_pred, title="Single Layer Perceptron")
Single Layer Perceptron Mean absolute error = 0.09 Mean squared error = 0.01 Median absolute error = 0.08 R2 score = -0.2 Root Mean Squared Logarithmic Error = 0.08205369959883177
# 3. Check quality of model MLP
determineRegressionMetrics(y_test, mlp_y_pred, title="Multi Layer Perceptron")
Multi Layer Perceptron Mean absolute error = 0.09 Mean squared error = 0.01 Median absolute error = 0.08 R2 score = -0.13 Root Mean Squared Logarithmic Error = 0.07981073859187716
A hyperparameter optimisation is to be carried out for a multi layer perceptron. The goal is to find the best configuration within the value ranges given below for the hyperparameters hidden_layer_sizes, activation and learning_rate. Either GridSearchCV or RandomizedSearchCV can be used for this. GridSearchCV simply tries out every configuration and therefore takes a lot of time; RandomizedSearchCV samples the search space heuristically and is therefore faster. Apply one of these two methods to find the optimal configuration for the parameter grid given below. Which is the optimal configuration and what neg_mean_absolute_error does it lead to?
param_grid = [{'hidden_layer_sizes': [(10,),(20,),(30,),(40,),(50,),(100,),(10,10)],
'activation': ["logistic", "tanh", "relu"],
'learning_rate': ["constant", "invscaling", "adaptive"]}]
param_grid
[{'hidden_layer_sizes': [(10,), (20,), (30,), (40,), (50,), (100,), (10, 10)],
'activation': ['logistic', 'tanh', 'relu'],
'learning_rate': ['constant', 'invscaling', 'adaptive']}]
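For a sense of the search effort: the grid above contains 7 · 3 · 3 = 63 configurations, each of which GridSearchCV evaluates with (by default) 5-fold cross-validation, i.e. 315 fits in total, while RandomizedSearchCV with its default n_iter=10 only samples 10 of them. A small sketch to verify the grid size:
# Number of candidate configurations in the parameter grid
from sklearn.model_selection import ParameterGrid
print(len(ParameterGrid(param_grid)))  # 7 hidden layer sizes * 3 activations * 3 learning rates = 63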
Which is the optimal configuration and what neg_mean_absolute_error does it lead to?
As already mentioned in the task description, the computation took comparatively long and led to different results on different runs. At the time of writing this report it was {'activation': 'relu', 'hidden_layer_sizes': (100,), 'learning_rate': 'adaptive'}.
Passing this configuration to the MLP yields a mean absolute error of about 0.1. Since the neg_mean_absolute_error is simply the negated mean_absolute_error and carries the same information, we can use it here as well: the neg_mean_absolute_error would therefore be about -0.1, a reasonably good result.
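Instead of converting by hand, the desired score can also be requested directly from the search. A sketch, assuming the param_grid and the scaled training data from above, that passes scoring='neg_mean_absolute_error', so that best_score_ already contains the cross-validated neg_mean_absolute_error of the best configuration:
# Let the search optimise the negative mean absolute error directly
gscv_mae = GridSearchCV(estimator=MLPRegressor(), param_grid=param_grid, scoring="neg_mean_absolute_error")
gscv_mae.fit(X_train, y_train)
print(gscv_mae.best_params_)  # best configuration
print(gscv_mae.best_score_)   # cross-validated neg_mean_absolute_error of that configuration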
# Init new Regressor
mlp_sgdr = MLPRegressor()
# Grid Search CV
gscv = GridSearchCV(estimator=mlp_sgdr, param_grid=param_grid)
search_gscv = gscv.fit(X_train, y_train)
# Print the best params according to Grid Search CV
print(search_gscv.best_params_)
{'activation': 'relu', 'hidden_layer_sizes': (100,), 'learning_rate': 'invscaling'}
### Test with Result Grid Search CV
# 1. Train Algorithm MLP
mlp_sgdr = MLPRegressor(hidden_layer_sizes=(100,), activation="relu", learning_rate="invscaling")
mlp_sgdr.fit(X_train, y_train)
# 2. Predict with test data MLP
mlp_optimized_y_pred = mlp_sgdr.predict(X_test)
# 3. Check quality of model MLP
determineRegressionMetrics(y_test, mlp_optimized_y_pred, title="Multi Layer Perceptron")
Multi Layer Perceptron Mean absolute error = 0.1 Mean squared error = 0.01 Median absolute error = 0.09 R2 score = -0.25 Root Mean Squared Logarithmic Error = 0.08591679361443827
Which is the optimal configuration and what neg_mean_absolute_error does it lead to?
In contrast to the grid search, this computation did not take very long, and it also led to different results on different runs. At the time of writing this report it was {'learning_rate': 'invscaling', 'hidden_layer_sizes': (40,), 'activation': 'relu'}.
Passing this configuration to the MLP yields a mean absolute error of about 0.07, so the neg_mean_absolute_error would be about -0.07. A better result than with GridSearchCV.
# Init new Regressor
mlp_sgdr = MLPRegressor()
# Randomized Search CV
randomized_search_cv = RandomizedSearchCV(estimator=mlp_sgdr, param_distributions=param_grid)
search_randomized = randomized_search_cv.fit(X_train, y_train)
print(search_randomized.best_params_)
{'learning_rate': 'invscaling', 'hidden_layer_sizes': (100,), 'activation': 'relu'}
### Test with Result Randomized Search CV
# 1. Train Algorithm MLP
mlp_sgdr = MLPRegressor(hidden_layer_sizes=(40,), activation="relu", learning_rate="invscaling")
mlp_sgdr.fit(X_train, y_train)
# 2. Predict with test data MLP
mlp_optimized_y_pred = mlp_sgdr.predict(X_test)
# 3. Check quality of model MLP
determineRegressionMetrics(y_test, mlp_optimized_y_pred, title="Multi Layer Perceptron")
Multi Layer Perceptron Mean absolute error = 0.06 Mean squared error = 0.01 Median absolute error = 0.04 R2 score = 0.44 Root Mean Squared Logarithmic Error = 0.056513212107366456
We have a database from one of our other university projects. It contains roughly 20,000 phone records with model, brand, weight, release date, operating system, platform and device category.
A potential goal here is to predict the device category from the weight.
# Get csv data
phones = pd.read_csv("./Phonedata.csv")
# Strip Data that is unnecessary
phones.drop(columns=['id'], inplace=True)
phones.head()
| | model | brand | weight | released | operatingSystem | platform | deviceCategory |
|---|---|---|---|---|---|---|---|
| 0 | MyPal A620 | Asus | 141 g | 2003 Jun | Microsoft Windows Mobile 2003 for Pocket PC Pr... | Windows (mobile-class) | PDA |
| 1 | Jornada 720 | Hewlett-Packard | 510 g | 2000 Sep | Microsoft Handheld PC 2000 (Galileo) | Windows (mobile-class) | Palmtop |
| 2 | iPAQ H3630 / H3635 / H3650 | Compaq | 174 g | 2000 Aug. | Microsoft Pocket PC 2000 (Rapier) | Windows (mobile-class) | PDA |
| 3 | Cassiopeia E-200 | Casio | 190 g | 2001 Dec | Microsoft Pocket PC 2002 Premium Edition (Merlin) | Windows (mobile-class) | PDA |
| 4 | Jornada 728 | Hewlett-Packard | 515 g | 2002 Jan. | Microsoft Handheld PC 2000 (Galileo) | Windows (mobile-class) | Palmtop |
# Describe
display(phones.describe()) # Realizing there are duplicates
phones.drop_duplicates(inplace=True)
for column in phones:
print(f"{column}: ", phones[column].nunique())
# After deleting duplicates, there are still non-unique model names.
# That's because the same model sometimes appears with a different OS, which makes the row not an exact duplicate
# (see the sketch after the output below).
| | model | brand | weight | released | operatingSystem | platform | deviceCategory |
|---|---|---|---|---|---|---|---|
| count | 19443 | 19443 | 19443 | 19443 | 19443 | 19443 | 19443 |
| unique | 19366 | 455 | 1132 | 2909 | 305 | 13 | 11 |
| top | N1 | Samsung | 145 g | 2018 Jun | Google Android 10 (Q) | Android | Smartphone |
| freq | 3 | 2945 | 371 | 136 | 1560 | 15827 | 16424 |
model: 19366 brand: 455 weight: 1132 released: 2909 operatingSystem: 305 platform: 13 deviceCategory: 11
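The claim in the comment above, that the remaining non-unique model names stem from different operating system variants, can be checked directly. A quick sketch, assuming the phones DataFrame from above:
# Show model names that still occur more than once after drop_duplicates
repeated_models = phones[phones.duplicated(subset=['model'], keep=False)]
print(repeated_models.sort_values('model')[['model', 'operatingSystem']].head(10))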
# Convert the weight column to a numeric type
if(phones['weight'].dtype == 'object'):
phones['weight'] = phones['weight'].str.replace(' g', '').astype(float)
phones.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 19435 entries, 0 to 19442 Data columns (total 7 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 model 19435 non-null object 1 brand 19435 non-null object 2 weight 19435 non-null float64 3 released 19435 non-null object 4 operatingSystem 19435 non-null object 5 platform 19435 non-null object 6 deviceCategory 19435 non-null object dtypes: float64(1), object(6) memory usage: 1.2+ MB
# Counts per Brand
phones['brand'].value_counts()[:20].plot(kind='bar', figsize=(10, 5))
# Since there are 455 brands, we will only look at the top 20
<AxesSubplot:>
# Group by deviceCategory for top 20 brands
top_20_brands_names = phones['brand'].value_counts().sort_values(ascending=False)[:20].index
top_20_brands = phones[phones['brand'].isin(top_20_brands_names)]
category_by_brand = top_20_brands.groupby(['brand','deviceCategory']).size().unstack()
category_by_brand.plot(kind='barh',stacked=True, figsize=(10, 7))
<AxesSubplot:ylabel='brand'>
# This is kind of obvious: since we are looking at a phone database, most entries are smartphones.
# Smartphones by brand might be interesting though.
smartphone_entries = phones[phones['deviceCategory'].isin(['Smartphone'])]
smartphone_entries_top_20_brands = smartphone_entries[smartphone_entries['brand'].isin(top_20_brands_names)]
smartphones_by_brand = smartphone_entries_top_20_brands.groupby(['brand']).size().sort_values(ascending=True)
smartphones_by_brand.plot(kind='barh',stacked=True, figsize=(10, 7))
<AxesSubplot:ylabel='brand'>
# Boxplot for brand and weight
sns.boxplot(y='brand', x='weight', data=smartphone_entries_top_20_brands)
sns.set(rc={'figure.figsize':(10, 12)})
'''This looks interesting: Lenovo has a smartphone that weighs 1.7 kg?? The rest looks as expected, since the goal in the industry is to make the phone as light as possible. This seems to be a typo in the database, since 166.5 g would be plausible, but not 1665 g'''
'This looks interesting: Lenovo has a smartphone that weighs 1.7 kg?? The rest looks as expected, since the goal in the industry is to make the phone as light as possible. This seems to be a typo in the database, since 166.5 g would be plausible, but not 1665 g'
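Such implausible weights can be inspected directly (and filtered out if desired). A quick sketch, assuming the smartphone_entries DataFrame from above; the 1000 g threshold is an arbitrary cut-off chosen only for illustration:
# Inspect smartphones with an implausibly high weight (likely data entry errors)
heavy_smartphones = smartphone_entries[smartphone_entries['weight'] > 1000]
print(heavy_smartphones[['brand', 'model', 'weight']])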
# Scatter for brand and weight
px.scatter(phones, x='brand', y='weight', color='deviceCategory').show()
# Unfortunately, we only have one numerical column,
# which makes a scatter kind of useless and almost equivalent to a boxplot
# What we can see though, is that as expected the order of weights is seen in layers,
# from smartwatches in dark blue, over smartphones in green to tablets in cyan.
# Drop everything but weight, brand and deviceCategory
# Only look into Smartwatch, Smartphone, Tablet and notebook
reduced_phones = phones[['brand', 'deviceCategory', 'weight']]
reduced_phones = reduced_phones[reduced_phones['deviceCategory'].isin(['Smartwatch', 'Smartphone', 'Tablet', 'Notebook'])]
reduced_phones = reduced_phones[reduced_phones['brand'].isin(top_20_brands_names)]
# One hot encode categoric values
lb = LabelBinarizer()
le = LabelEncoder()
# deviceCategory = lb.fit_transform(reduced_phones['deviceCategory'])
# classes_deviceCategory = lb.classes_
# print(classes_deviceCategory)
brand = lb.fit_transform(reduced_phones['brand'])
classes_brand = lb.classes_
print(classes_brand)
# Get all feature names in correct order
feature_names = np.concatenate([["weight"], classes_brand])
# Concatenate categoric features with numeric features
numerical = reduced_phones['weight'].values.reshape(-1, 1)
categoric_one_hot = np.concatenate([brand], axis=1)
print(numerical)
X = np.concatenate([numerical, categoric_one_hot], axis=1)
# Label encode goal variable
y = le.fit_transform(reduced_phones['deviceCategory'])
print(y)
['Acer' 'Alcatel' 'Apple' 'Asus' 'BBK' 'HTC' 'Huawei' 'LG' 'Lenovo' 'Meizu' 'Motorola' 'Nokia' 'OnePlus' 'Oppo' 'RIM' 'Samsung' 'Sharp' 'Sony' 'Xiaomi' 'ZTE'] [[162.] [170.] [174.] ... [173.] [173.] [173.]] [1 1 1 ... 1 1 1]
# Training data
# Split the data into 70% training and 30% test data; random_state=0 makes the shuffled split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, random_state=0)
# Create Tree classifer object
decision_tree = DecisionTreeClassifier()
random_forest = RandomForestClassifier()
# Train Decision Tree
decision_tree = decision_tree.fit(X_train, y_train)
# Predict the response for test data
decision_tree_y_pred = decision_tree.predict(X_test)
# Create classification report
report = classification_report(y_test, decision_tree_y_pred)
print(report)
# Create confusion matrix
decision_tree_cm = confusion_matrix(y_test, decision_tree_y_pred)
decision_tree_cmd = ConfusionMatrixDisplay(decision_tree_cm)
figure, axes = plt.subplots(figsize=(19, 14))
axes.grid(False)
decision_tree_cmd.plot(ax=axes)
# This looks like either we did something wrong,
# or the data is not good for classification,
# or maybe too good.
precision recall f1-score support
0 0.67 0.80 0.73 10
1 1.00 1.00 1.00 3714
2 0.97 1.00 0.99 37
3 0.95 0.96 0.96 395
accuracy 0.99 4156
macro avg 0.90 0.94 0.92 4156
weighted avg 0.99 0.99 0.99 4156
<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x185874cd420>
# 10x cross validation
decision_tree_score = cross_val_score(decision_tree, X, y, cv=10)
print(decision_tree_score)
# Feature importances
print(decision_tree.feature_importances_)
plt.bar(np.arange(len(decision_tree.feature_importances_)), decision_tree.feature_importances_)
plt.xticks(np.arange(len(decision_tree.feature_importances_)), feature_names, rotation='vertical', fontsize = 'x-small')
plt.show()
[0.95670996 0.97907648 0.97546898 0.97761733 0.98050542 0.98916968 0.99061372 0.99422383 0.98772563 0.97472924] [9.70681384e-01 1.97639900e-03 1.22428402e-03 0.00000000e+00 2.81215109e-03 0.00000000e+00 2.33222096e-03 2.78288949e-03 2.61697354e-03 9.08696754e-04 0.00000000e+00 3.79857850e-03 8.24676535e-04 0.00000000e+00 2.76860236e-03 0.00000000e+00 5.87882326e-03 1.04379573e-03 0.00000000e+00 3.50524367e-04 0.00000000e+00]
# Train random forest
random_forest = random_forest.fit(X_train, y_train)
# Predict the response for test data
random_forest_y_pred = random_forest.predict(X_test)
# Create classification report
report = classification_report(y_test, random_forest_y_pred)
print(report)
# Create confusion matrix
random_forest_cm = confusion_matrix(y_test, random_forest_y_pred)
random_forest_cmd = ConfusionMatrixDisplay(random_forest_cm)
figure, axes = plt.subplots(figsize=(19, 14))
axes.grid(False)
random_forest_cmd.plot(ax=axes)
plt.show()
# 10x cross validation
random_forest_score = cross_val_score(random_forest, X, y, cv=10)
print(random_forest_score)
# Feature importances
print(random_forest.feature_importances_)
plt.bar(np.arange(len(random_forest.feature_importances_)), random_forest.feature_importances_)
plt.xticks(np.arange(len(decision_tree.feature_importances_)), feature_names, rotation='vertical', fontsize = 'x-small')
plt.show()
precision recall f1-score support
0 0.78 0.70 0.74 10
1 1.00 1.00 1.00 3714
2 0.97 1.00 0.99 37
3 0.95 0.98 0.97 395
accuracy 0.99 4156
macro avg 0.93 0.92 0.92 4156
weighted avg 0.99 0.99 0.99 4156
[0.96536797 0.97691198 0.98051948 0.97761733 0.98050542 0.99205776 0.99133574 0.99638989 0.98700361 0.97111913] [9.27837805e-01 3.06958602e-03 8.03462186e-04 3.56272025e-02 3.79270641e-03 1.18747332e-03 1.85535575e-03 1.31011241e-03 1.58053528e-03 8.98938127e-03 1.12655000e-04 2.99783309e-03 6.89876654e-04 1.43930286e-05 2.97150048e-03 1.41414996e-04 4.39164381e-03 3.34275368e-04 5.03859150e-04 9.03190290e-04 8.85738334e-04]
Conclusion: We were able to apply what we had learned before without much effort. The only drawback is that this data set is not really well suited for classification, because the categories barely overlap: smartwatches, smartphones, tablets and notebooks are separated very clearly by weight alone, so it is understandable that the importance of the weight feature is by far the highest.
It is interesting, however, that apart from the weight it also seems to matter whether the brand is Apple or not; the influence is not large, but it is there.
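To back up this observation, the feature importances of the random forest can be sorted and labelled; a small sketch using the feature_names and random_forest objects from above:
# The five most important features of the random forest, by name
importances = sorted(zip(feature_names, random_forest.feature_importances_), key=lambda item: item[1], reverse=True)
for name, importance in importances[:5]:
    print(f"{name}: {importance:.4f}")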